Image processing apparatus, image processing method, and non-transitory computer-readable medium

ABSTRACT

There is provided an image processing apparatus. A captured image of a target object that is captured by an image capturing apparatus is obtained. Information that indicates a deterioration degree of the captured image is obtained for a position in the captured image. A feature of the target object is extracted from the captured image based on the deterioration degree. The feature of the target object and a feature of a three-dimensional model of the target object, observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation, are associated. A position and orientation of the target object with respect to the image capturing apparatus are derived by correcting the predetermined position and orientation based on a result of the association.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an image processing apparatus, an image processing method, and a non-transitory computer-readable medium.

2. Description of the Related Art

With the development of robot technology in recent years, complicated tasks that people have thus far performed, such as assembling an industrial product, are increasingly being performed by robots instead. To perform assembling, such a robot grips a component by using an end effector such as a hand. For the robot to grip the component, there is a need to measure a relative position and orientation (hereinafter, simply referred to as position and orientation) between the component that is a target of gripping and the robot (hand). Measurement of such position and orientation can additionally be applied to various objectives, such as self-position estimation for the robot to autonomously move, or alignment between a virtual object and a physical space (a physical object) in augmented reality.

As a method of measuring the position and orientation of an object, T. Drummond and R. Cipolla, “Real-time visual tracking of complex structures”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 932-946, 2002 discloses a method of causing a projection image of a three-dimensional model of an object represented by a collection of line segments to match an edge feature on an image obtained by an image capturing apparatus. Specifically, based on a coarse position and orientation provided as known information, a line segment in a three-dimensional model is projected onto an image. Next, edge features corresponding to each control point discretely arranged on the projected line segment are detected from the image. Then final position and orientation measurement values are obtained by correcting the coarse position and orientation so that a sum of squares of distances on the image between the projection image of the line segment to which the control points belong and corresponding edge features is a minimum.
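The quantity minimized in the method described above can be sketched as follows. This is a minimal illustration, not the implementation of the cited reference: the function names and the representation of each correspondence as an (edge point, point on projected line, line direction) triple are assumptions made only for this example, and the pose correction step itself is omitted.

```python
import math

def point_to_line_distance(point, line_point, line_dir):
    """Signed perpendicular distance from a 2D point to the line that passes
    through line_point with direction line_dir (hypothetical helper)."""
    dx, dy = line_dir
    norm = math.hypot(dx, dy)
    nx, ny = -dy / norm, dx / norm  # unit normal of the projected line
    return (point[0] - line_point[0]) * nx + (point[1] - line_point[1]) * ny

def sum_squared_distances(correspondences):
    """Sum of squared distances between detected edge points and the
    projected line segments they were associated with."""
    total = 0.0
    for edge_point, line_point, line_dir in correspondences:
        d = point_to_line_distance(edge_point, line_point, line_dir)
        total += d * d
    return total

# Two edge points associated with a horizontal projected line through the origin.
example = [
    ((3.0, 2.0), (0.0, 0.0), (1.0, 0.0)),
    ((5.0, -1.0), (0.0, 0.0), (1.0, 0.0)),
]
cost = sum_squared_distances(example)
```

A pose estimator would re-project the line segments for a corrected position and orientation and repeat this evaluation until the cost stops decreasing.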

SUMMARY OF THE INVENTION

According to an embodiment of the present invention, an image processing apparatus comprises: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; an extraction unit configured to extract a feature of the target object from the captured image based on the deterioration degree; a model holding unit configured to hold a three-dimensional model of the target object; an associating unit configured to associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and a deriving unit configured to derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association.

According to another embodiment of the present invention, an image processing apparatus comprises: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image for a position in the captured image; a holding unit configured to hold a plurality of comparison target images; and a determination unit configured to determine, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.

According to still another embodiment of the present invention, an image processing apparatus comprises: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image; a setting unit configured to set an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and an extraction unit configured to extract a feature of the captured image by using the extraction parameter set by the setting unit with reference to the captured image.

According to yet another embodiment of the present invention, an image processing method comprises: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image, for a position in the captured image; extracting a feature of the target object from the captured image based on the deterioration degree; holding a three-dimensional model of the target object; associating the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and deriving a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of the associating.

According to still yet another embodiment of the present invention, an image processing method comprises: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image for a position in the captured image; holding a plurality of comparison target images; and determining, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.

According to yet still another embodiment of the present invention, an image processing method comprises: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image; setting an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and extracting a feature of the captured image by using the extraction parameter with reference to the captured image.

According to still yet another embodiment of the present invention, a non-transitory computer-readable medium stores a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; extract a feature of the target object from the captured image based on the deterioration degree; hold a three-dimensional model of the target object; associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association.

According to yet still another embodiment of the present invention, a non-transitory computer-readable medium stores a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image for a position in the captured image; hold a plurality of comparison target images; and determine, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.

According to still yet another embodiment of the present invention, a non-transitory computer-readable medium stores a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image; set an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and extract a feature of the captured image by using the extraction parameter with reference to the captured image.

According to yet still another embodiment of the present invention, an image processing apparatus comprises: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; an extraction unit configured to extract a feature of the target object from the captured image; a model holding unit configured to hold a three-dimensional model of the target object; an associating unit configured to associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and a deriving unit configured to derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association and the deterioration degree corresponding to a position at which the feature was extracted.

According to still yet another embodiment of the present invention, an image processing method comprises: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image, for a position in the captured image; extracting a feature of the target object from the captured image; holding a three-dimensional model of the target object; associating the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and deriving a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of the associating and the deterioration degree corresponding to a position at which the feature was extracted.

According to yet still another embodiment of the present invention, a non-transitory computer-readable medium stores a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; extract a feature of the target object from the captured image; hold a three-dimensional model of the target object; associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association and the deterioration degree corresponding to a position at which the feature was extracted.

Further features of the present invention will become apparent from the following description of exemplary embodiments (with reference to the attached drawings).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a view showing a configuration of an information processing apparatus 1 according to the first embodiment.

FIG. 2 is a flowchart of processing according to the first embodiment.

FIG. 3 is a flowchart of deterioration degree calculation processing according to the first embodiment.

FIG. 4 is a flowchart of feature extraction processing according to the first embodiment.

FIGS. 5A-5F are views showing a filter for extracting an edge feature.

FIG. 6 is a flowchart of position and orientation calculation processing according to the first embodiment.

FIGS. 7A and 7B are views explaining a method of associating the edge feature.

FIG. 8 is a view showing a relation between a model edge feature and an image edge feature.

FIG. 9 is a flowchart of feature extraction processing according to a sixth variation.

FIG. 10 is a flowchart of feature extraction processing according to a seventh variation.

FIG. 11 is a flowchart of deterioration degree calculation processing according to an eighth variation.

FIG. 12 is a view showing a method of measuring a three-dimensional position of a target object.

FIG. 13 is a view for showing an example of an illumination pattern.

FIG. 14 is a view showing a configuration of a robot system according to a fourth embodiment.

FIG. 15 is a view showing a configuration of a computer according to a sixth embodiment.

DESCRIPTION OF THE EMBODIMENTS

In the method recited in Drummond and Cipolla, if degradation in an image caused by an image blur, bokeh, or the like occurs, there is a problem in that precision of measurement of the position and orientation degrades. This is considered to be because, when the image is degraded, the position of an edge feature extracted from the image shifts from the original position of the edge feature.

According to an embodiment of the present invention, when a captured image of an object is used to measure the position and orientation of the object, it is possible to improve measurement precision even if degradation occurs in the captured image.

Explanation is given below of embodiments of the present invention based on the drawings. However, the scope of the present invention is not limited to the following embodiments.

First Embodiment

In the first embodiment, an estimation of the position and orientation of a measurement target object is performed considering a deterioration degree of a captured image of the measurement target object. Specifically, when extracting a feature from a captured image, feature extraction is performed at high precision by considering the deterioration degree of the captured image. In addition, when causing a three-dimensional model of the measurement target object to fit to extracted features, weighting in accordance with the deterioration degree of the captured image is performed. Below, the measurement target object is simply called the target object.

Explanation is given below of a case of estimating the position and orientation of the target object based on a captured image captured by the image capturing apparatus. In such a case, the relative position and orientation of the target object with respect to the image capturing apparatus is calculated. In the explanation below, to the extent that it is not described otherwise, the position and orientation of the target object indicate the relative position and orientation of the target object with respect to the image capturing apparatus. Based on the position and orientation of the image capturing apparatus in a coordinate system and the relative position and orientation of the target object with respect to the image capturing apparatus, it is easy to calculate the position and orientation of the target object in this coordinate system. Accordingly, calculating the relative position and orientation of the target object with respect to the image capturing apparatus is equivalent to calculating the position and orientation of the target object in a coordinate system.
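The conversion from a relative position and orientation to a position and orientation in another coordinate system can be illustrated as below. This is a sketch under the assumption that each position and orientation is expressed as a 4x4 homogeneous transformation matrix (a 3x3 rotation plus a translation); that representation and all names are chosen for this example only and are not specified in the present description.

```python
def make_pose(rotation, translation):
    """Build a 4x4 homogeneous matrix from a 3x3 rotation and a 3-vector."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows

def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def object_pose_in_world(world_from_camera, camera_from_object):
    """Compose the camera pose in a coordinate system with the relative
    pose of the target object with respect to the camera."""
    return matmul4(world_from_camera, camera_from_object)

# Camera rotated 90 degrees about the z axis and located at (1, 0, 0).
world_from_camera = make_pose(
    [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [1.0, 0.0, 0.0])
# Target object 2 units along the camera x axis, with no relative rotation.
camera_from_object = make_pose(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [2.0, 0.0, 0.0])

world_from_object = object_pose_in_world(world_from_camera, camera_from_object)
object_position = [world_from_object[i][3] for i in range(3)]
```

Because the composition is a single matrix product, estimating the relative position and orientation is sufficient once the position and orientation of the image capturing apparatus is known.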

The deterioration degree indicates the degree of deterioration of the captured image which is captured by the image capturing apparatus. By considering the deterioration degree of the captured image of the target object when estimating the position and orientation of the target object, it is possible to estimate the position and orientation of the target object with high precision. In the present embodiment, the deterioration degree is generated in advance and is held in advance by a deterioration degree holding unit 102 described later. For example, the deterioration degree is calculated before the captured image of the target object is obtained by an image capturing apparatus 108, or before the captured image of the target object is obtained by an image obtaining unit 103. In the case of the present embodiment, a deterioration degree calculation unit 101 calculates the deterioration degree before the captured image of the target object is obtained by the image obtaining unit 103. According to such a configuration, because processing that calculates the deterioration degree after the image obtaining unit 103 obtains the captured image of the target object can be omitted, it is possible to perform the estimation of the position and orientation of the target object at high speed. In another embodiment, the deterioration degree may be calculated after the captured image of the target object is obtained by the image capturing apparatus 108, or after the captured image of the target object is obtained by the image obtaining unit 103. For example, when the deterioration degree has become necessary in processing of a feature extraction unit 104 or a reliability calculation unit 105, the deterioration degree calculation unit 101 may calculate the deterioration degree in real-time.

In the present embodiment, the deterioration degree according to the relative position and orientation of the target object with respect to the image capturing apparatus is set for each feature (model feature) on the three-dimensional model of the target object. This deterioration degree represents a degree of deterioration of an image of a feature of the target object corresponding to a model feature on the captured image when the target object is captured by the image capturing apparatus. In the present embodiment, a deterioration degree of an image due to bokeh and blur is considered. In other words, the deterioration degree due to bokeh and blur is calculated and held in advance in consideration of image capturing conditions. The image capturing conditions include a relative speed between the target object and the image capturing apparatus, or the like. The image capturing conditions also include an image capturing parameter of the image capturing apparatus, such as an exposure time, a focal position, or an aperture.

Features extracted from the captured image are not particularly limited, but an edge feature is used in the present embodiment. The edge feature is a point that is an extreme value of a luminance gradient which is extracted by applying a differential filter such as a Sobel filter to the captured image. In the present embodiment, to correctly extract the edge feature, a filter in accordance with a bokeh amount and a blur amount is used. In the embodiment, a large size filter is used when the bokeh amount and the blur amount are large, and a small size filter is used when the bokeh amount and the blur amount are small. In addition, in the fitting of an edge feature extracted from the captured image (an image edge feature) and a feature on the three-dimensional model (a model edge feature), the smaller the deterioration degree, the larger the weight given to the edge feature.
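The two uses of the deterioration degree described above (selecting a filter size and weighting an edge feature) can be sketched as follows. The specific formulas are assumptions made for illustration only: the combination of the bokeh amount and the blur amount into one value, the factor of 3 used for the filter size, and the reciprocal form of the weight are hypothetical and are not taken from the present description.

```python
import math

def combined_deterioration(bokeh, blur):
    """Hypothetical single deterioration value built from both amounts."""
    return math.hypot(bokeh, blur)

def filter_size(bokeh, blur, min_size=3):
    """Odd filter size that grows with the deterioration (assumed rule)."""
    sigma = combined_deterioration(bokeh, blur)
    size = 2 * math.ceil(3 * sigma) + 1
    return max(min_size, size)

def edge_weight(bokeh, blur):
    """Weight that decreases as the deterioration increases (assumed rule)."""
    return 1.0 / (1.0 + combined_deterioration(bokeh, blur))

sharp_size = filter_size(0.0, 0.0)       # small filter for a sharp image
degraded_size = filter_size(3.0, 4.0)    # larger filter for a degraded image
sharp_weight = edge_weight(0.0, 0.0)
degraded_weight = edge_weight(3.0, 4.0)
```

Any monotonic rule would serve the same purpose; the only properties relied on here are that the filter size does not decrease and the weight does not increase as the deterioration grows.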

FIG. 1 shows a configuration of an information processing apparatus 1 according to the first embodiment, which is an example of the information processing apparatus according to the present invention. The information processing apparatus 1 comprises the deterioration degree calculation unit 101, the deterioration degree holding unit 102, the image obtaining unit 103, the feature extraction unit 104, the reliability calculation unit 105, a model holding unit 106, and a matching unit 107.

The deterioration degree calculation unit 101 calculates information that indicates an image deterioration degree for each position in the captured image captured by the image capturing apparatus. In the present embodiment, the deterioration degree calculation unit 101 calculates an image deterioration degree at a position of a model edge feature on the captured image for each model edge feature.

In the present embodiment, the deterioration degree calculation unit 101 calculates the bokeh amount and the blur amount on the captured image for each model edge feature as the deterioration degree. In the embodiment, the deterioration degree calculation unit 101 calculates the deterioration degree by a simulation that considers image capturing conditions according to the image capturing apparatus 108 and the three-dimensional model that indicates a three-dimensional shape of the target object. A detailed method of calculating the deterioration degree is described later. The deterioration degree holding unit 102 holds the deterioration degree calculated by the deterioration degree calculation unit 101 for each model edge feature. The deterioration degree is not particularly limited as long as it indicates the image deterioration degree of the captured image. For example, the deterioration degree may be at least one of the blur amount and the bokeh amount of an image. In addition, a parameter based on the bokeh amount and the blur amount (for example, a σ₀ described later) may be calculated as the deterioration degree. Furthermore, three or more parameters may be calculated as the deterioration degree.

In the present embodiment, the deterioration degree calculation unit 101 calculates the deterioration degree for each model edge feature that the model holding unit 106 holds position information for. The deterioration degree holding unit 102 holds the deterioration degree for each of the model edge features that the model holding unit 106 holds position information for. For any point on the three-dimensional model, it is possible to calculate the deterioration degree by a similar method.

In the present embodiment, matching of an image of the target object and an image of the comparison target (three-dimensional model image) is performed as will be explained later. If matching is good, for example in the final stage of optimization, it can be considered that the image of the target object is approximate to the image of the comparison target. Accordingly, it can be considered that the deterioration degree of the image at the position of a model edge feature on the captured image corresponds to the deterioration degree of the image edge feature in accordance with the captured image.

In the present embodiment, the deterioration degree calculation unit 101 detects positions of groups of model edge features for various relative positions and orientations of the target object with respect to the image capturing apparatus 108. The deterioration degree calculation unit 101 then calculates a deterioration degree for each model edge feature. The deterioration degree holding unit 102 holds a group of deterioration degrees calculated for a group of model edge features in association with a relative position and orientation of the target object with respect to the image capturing apparatus 108. Here, the deterioration degree calculated for each model edge feature is associated with the relative position and orientation of the model edge feature with respect to the image capturing apparatus 108.

In the present embodiment, if the relative position and orientation of the target object with respect to the image capturing apparatus 108 is the same, it is assumed that the deterioration degree of the captured image is also the same. If image capturing conditions of the image capturing apparatus 108, such as the focal position and the aperture value, are fixed, it is estimated that the bokeh amount of the captured image is similar if the relative position and orientation of the target object with respect to the image capturing apparatus 108 is fixed.

Consider a case in which, when an industrial robot grips the target object, the position and orientation of the target object are measured by capturing the target object using the image capturing apparatus 108, which is fixed to the robot. Because it is predicted that the industrial robot repeats a fixed operation, it can be considered that the speed of the image capturing apparatus 108 in accordance with the position and orientation of the image capturing apparatus 108 is fixed. In addition, it can be considered that the target object is stationary or is moving at a fixed speed riding on a conveyor belt or the like. Accordingly, if image capturing conditions of the image capturing apparatus 108, such as shutter speed, are fixed, it is estimated that the blur amount of the captured image is similar if the relative position and orientation of the target object with respect to the image capturing apparatus 108 is fixed.

In another embodiment, the deterioration degree calculation unit 101 may use various image capturing conditions in accordance with the image capturing apparatus 108 to calculate a group of deterioration degrees. In such a case, the deterioration degree holding unit 102 can hold a group of deterioration degrees in association with image capturing conditions and the relative position and orientation of the target object with respect to the image capturing apparatus 108. In such a case, for example, it is possible to estimate the position and orientation of the target object by using a group of deterioration degrees that corresponds to a coarse position/orientation of the target object and image capturing conditions of the captured image obtained by the image obtaining unit 103.

The image obtaining unit 103 obtains the captured image which is acquired by capturing the target object. In the present embodiment, the image obtaining unit 103 obtains, from the image capturing apparatus 108 connected to the information processing apparatus 1, the captured image of the target object which is captured by the image capturing apparatus 108. In another embodiment, the image obtaining unit 103 may obtain the captured image from a storage unit (not shown) which the information processing apparatus 1 comprises and which stores the captured image. Furthermore, the image obtaining unit 103 may obtain the captured image from an external storage apparatus (not shown) that is connected to the information processing apparatus 1 via a network and stores the captured image obtained by the image capturing apparatus 108. In addition, a type of the captured image is not particularly limited as long as it is possible to extract features of the image of the target object. For example, the captured image may be a gray-scale image, may be a color image, or may be a range image.

The feature extraction unit 104 extracts features of the image of the target object from the captured image obtained by the image obtaining unit 103. In the present embodiment, the feature extraction unit 104 extracts an edge feature by performing edge detection processing with respect to the captured image. In this case, the feature extraction unit 104 extracts the edge feature by referring to the deterioration degree that the deterioration degree holding unit 102 holds. In other words, the feature extraction unit 104 sets an extraction parameter, which is used to extract a feature from the captured image, in accordance with the deterioration degree. The feature extraction unit 104 refers to the captured image and uses the set extraction parameter to extract a plurality of features from the captured image. In this way, the feature extraction unit 104 has a deterioration degree obtainment unit that obtains a deterioration degree that the deterioration degree holding unit 102 holds.

Specifically, the feature extraction unit 104 first obtains a group of deterioration degrees corresponding to a coarse value of a current position and orientation of the target object from the deterioration degree holding unit 102. In the present embodiment, the feature extraction unit 104 specifies, from positions and orientations of the target object associated with groups of deterioration degrees held by the deterioration degree holding unit 102, the position and orientation closest to the coarse value of the current position and orientation of the target object. The feature extraction unit 104 obtains a group of deterioration degrees associated with the specified position and orientation of the target object.
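The selection of the held group of deterioration degrees whose position and orientation is closest to the coarse value can be sketched as a nearest-neighbor lookup. The pose representation (three translation components followed by three rotation-vector components), the distance measure, and the table layout below are assumptions made for this example; the present description does not specify how closeness between positions and orientations is measured.

```python
import math

def pose_distance(pose_a, pose_b, rotation_weight=1.0):
    """Hypothetical closeness measure: translation distance plus a weighted
    distance between rotation-vector components."""
    translation = math.dist(pose_a[:3], pose_b[:3])
    rotation = math.dist(pose_a[3:], pose_b[3:])
    return translation + rotation_weight * rotation

def lookup_deterioration_group(table, coarse_pose):
    """Return the held pose closest to coarse_pose and its group of
    deterioration degrees (one entry per model edge feature)."""
    nearest = min(table, key=lambda pose: pose_distance(pose, coarse_pose))
    return nearest, table[nearest]

# Hypothetical held data: pose -> list of (bokeh amount, blur amount).
held = {
    (0.0, 0.0, 0.5, 0.0, 0.0, 0.0): [(0.8, 0.2), (0.9, 0.2)],
    (0.0, 0.0, 1.0, 0.0, 0.0, 0.0): [(0.1, 0.3), (0.2, 0.3)],
}
coarse = (0.02, -0.01, 0.9, 0.0, 0.05, 0.0)
nearest_pose, group = lookup_deterioration_group(held, coarse)
```

In practice the translation and rotation terms have different units, so the value of rotation_weight would need to be chosen for the scale of the scene.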

The feature extraction unit 104 sets an extraction parameter for extracting an edge feature based on an obtained deterioration degree. In the present embodiment, a filter coefficient of an edge extraction filter is set as the extraction parameter. Furthermore, the feature extraction unit 104 extracts the edge feature by applying the set filter to the captured image. Below, the edge feature that the feature extraction unit 104 extracts from the captured image is referred to as an image edge feature. A method of setting the filter is described later.

A coarse position and orientation of the target object can be obtained by using a publicly known method. As an example of a method that obtains a coarse position and orientation of the target object, the method recited in Hiroto Yoshii, “Coarse Position/Orientation Detection of Bulk Parts Using Ensemble Classification Tree”, Image Recognition and Understanding Symposium (MIRU2010), 2010 is given. As a coarse position and orientation of the target object, the position and orientation of the target object estimated immediately before may also be used.

The model holding unit 106 holds the three-dimensional model of the target object. The three-dimensional model represents a three-dimensional geometric shape of the target object. The expression format of the geometric shape is not particularly restricted. For example, the three-dimensional model may be data of a polygon format, i.e. may have a collection of planes and lines configured by three-dimensional points for expressing the geometric shape. The three-dimensional model may have a collection of three-dimensional lines that express edge lines, or may have a collection of simple three-dimensional points. In the present embodiment, the model holding unit 106 holds the position information of a three-dimensional edge feature (model edge feature) extracted from the three-dimensional model of the target object.

An example of a method of extracting the model edge feature that the model holding unit 106 holds is shown below. First, the image that would be obtained by the image capturing apparatus 108 capturing the target object is estimated by using the three-dimensional model of the target object. The image obtained in this way is called a projection image. For example, this processing can be implemented by arranging the three-dimensional model and a viewpoint corresponding to the image capturing apparatus 108 in a virtual space in accordance with the relative position and orientation of the target object with respect to the image capturing apparatus 108, and generating an image of the target object from the viewpoint.

Next, by applying a differential filter to the obtained projection image, an edge feature on the projection image is extracted. Furthermore, by back projecting the edge feature of the projection image onto the three-dimensional model, the model edge feature is extracted. For example, it is possible to extract a group of points on the three-dimensional model that corresponds to the edge feature on the projection image as the model edge feature. By performing the above processing on various relative positions and orientations of the target object with respect to the image capturing apparatus 108, one or more groups of model edge features are extracted in each relative position and orientation. The model holding unit 106 holds position information of the extracted one or more groups of model edge features. In the present embodiment, the model holding unit 106 holds the position information of the extracted one or more groups of model edge features in association with the relative position and orientation of the target object with respect to the image capturing apparatus 108. In such a case, the position information of the model edge feature may indicate the relative position and orientation of the model edge feature with respect to the image capturing apparatus 108.

The method of extracting the model edge feature from the three-dimensional model is not limited to the above described method, and using another method is possible. For example, if the three-dimensional model holds normal information of a surface, a location at which the normal direction is discontinuous can be extracted as the model edge feature. The model holding unit 106 may collectively hold groups of model edge features extracted in various relative positions and orientations. For example, the model holding unit 106 may hold model edge feature position information in the three-dimensional model. In such a case, it is possible to calculate the relative position and orientation of a model edge feature with respect to the image capturing apparatus 108, based on the relative position and orientation of the three-dimensional model with respect to the image capturing apparatus 108.

Model information that the model holding unit 106 holds may be the three-dimensional model itself. Furthermore, the model information may be information of a direction and position on the projection image of a two-dimensional edge feature image obtained by projecting a three-dimensional model edge feature onto a captured image surface of the image capturing apparatus 108. Even in such a case, it is possible to match the feature indicated by the model information with the feature extracted from the image. Furthermore, the model information may be an image obtained by capturing the target object. In such a case, a feature extracted from this image can be matched to a feature extracted by the feature extraction unit 104. In addition, it is also possible to perform matching between images in accordance with a later-described fifth embodiment. Furthermore, the model information may have an identifier that specifies the type of the target object.

The matching unit 107 calculates the position and orientation of the target object. The matching unit 107 associates a feature of the target object with a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation. The position and orientation of the target object with respect to the image capturing apparatus 108 are then derived by correcting the predetermined position and orientation based on a result of the associating and a deterioration degree corresponding to the position at which the feature was extracted.

An overview of the processing performed by the matching unit 107 is as below. Firstly, the matching unit 107 calculates, for each of a plurality of features extracted from the image of the target object, a difference between the respective feature and a feature of an image of a comparison target corresponding to the feature. In the present embodiment, the matching unit 107 calculates a difference, for each of a plurality of edge features extracted from the image of the target object, between the respective edge feature and an edge feature of the image of the comparison target that corresponds to the edge feature. Here, the image of the comparison target is a three-dimensional model image observed when a viewpoint and a three-dimensional model are arranged in accordance with a predetermined position and orientation. In the present embodiment, first the difference is calculated by using the three-dimensional model image when the viewpoint and the three-dimensional model are arranged in accordance with the coarse position/orientation of the target object obtained as described above. The difference is similarly calculated by using a plurality of three-dimensional model images obtained while changing the relative position and orientation between the viewpoint and the three-dimensional model.

As a specific example, the matching unit 107 obtains a group of model edge features corresponding to the current relative position and orientation of the target object with respect to the image capturing apparatus 108, from the groups of model edge features that the model holding unit 106 holds. For example, the matching unit 107 can detect, from the relative positions and orientations associated with the groups of model edge features that the model holding unit 106 holds, the one closest to the coarse value of the current relative position and orientation of the target object with respect to the image capturing apparatus 108. The matching unit 107 then obtains the group of model edge features associated with the detected relative position and orientation. For each of the groups of model edge features thus obtained, the relative position and orientation of each edge with respect to the viewpoint (i.e. the image capturing apparatus 108) is specified.

The matching unit 107 next associates the image edge features extracted by the feature extraction unit 104 with each of the obtained groups of model edge features. The method of the associating is not particularly limited, and associating can be performed by the following method, for example. Firstly, for each of a plurality of edges that the three-dimensional model has, the matching unit 107 calculates an image position obtained by projecting the edge on the projection image based on a predetermined position and orientation. For example, the matching unit 107 calculates the image position on the projection image observed when the viewpoint and the three-dimensional model are arranged in accordance with a predetermined position and orientation for each of the groups of model edge features. The projection image includes the three-dimensional model image observed when the viewpoint and the three-dimensional model are arranged in accordance with the predetermined position and orientation, and this three-dimensional model image is configured by images of a plurality of model edge features. Image edge features are then associated with model edge features, so that the image positions of the model edge features on the projection image approach the image positions of the image edge features on the captured image. An example of such a method of associating is explained later. It is sufficient to use information embedded in the target object, or the like, to identify an image edge feature corresponding to a model edge feature.

In a case where the model holding unit 106 holds the three-dimensional model as model information, the matching unit 107 obtains the three-dimensional model from the model holding unit 106. Next, the matching unit 107 projects the obtained three-dimensional model on a captured image surface of the image capturing apparatus 108 based on a coarse value of the current relative position and orientation of the target object with respect to the image capturing apparatus 108. Thus, the image of the three-dimensional model observed when a viewpoint and a three-dimensional model are arranged in accordance with a predetermined position and orientation is obtained. Furthermore, the matching unit 107 extracts model edge features from the projection image by using a method similar to the method of extracting model edge features that the model holding unit 106 holds. The matching unit 107 associates extracted model edge features with image edge features through the above described method.

The matching unit 107 obtains a reliability w that is applied to an image edge feature from the reliability calculation unit 105. The reliability w indicates a probability that an image edge feature is correctly extracted from the captured image, for example a probability that it is extracted at a correct position. In the present embodiment, the reliability w applied to an image edge feature is obtained in accordance with a deterioration degree for a corresponding model edge feature. As described above, because the deterioration degree expresses a degree of deterioration of the image for an image edge feature corresponding to a model edge feature on the captured image, it is possible to calculate the reliability w in accordance with the deterioration degree. In other embodiments, the reliability calculation unit 105 can calculate the reliability w in accordance with a deterioration degree corresponding to an image position at which the image edge feature was extracted.

The matching unit 107 then refers to the group of model edge features, the group of image edge features, and the reliability w, and calculates a position and orientation of the target object. In the present embodiment, the matching unit 107 determines, from a plurality of three-dimensional model images, the three-dimensional model image for which the degree of matching with respect to the image of the target object is highest as the image of the comparison target with respect to the image of the target object. Here, the degree of matching corresponds to a value obtained based on differences between a plurality of image edge features and the corresponding model edge features, which are weighted in accordance with the deterioration degree corresponding to the position at which the image edge feature was extracted.

In the present embodiment, the differences between the image edge features and the model edge features are distances between the image positions of the image edge features on the captured image and the image positions of the model edge features on the projection image. The weighting uses the reliability w calculated in accordance with the deterioration degree. Specifically, the higher the probability that the image edge feature is correctly extracted, the larger the weight that is applied.
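As a minimal sketch of this weighting, the following function sums the distances between associated image positions, each multiplied by its reliability w. The function name, the use of a plain weighted sum, and the convention that a smaller value corresponds to a higher degree of matching are assumptions made for illustration; the description only states that the degree of matching is obtained based on the weighted differences.

```python
import math

def weighted_mismatch(image_positions, model_positions, reliabilities):
    """Sum of reliability-weighted 2D distances between associated edges.

    image_positions: (u, v) of image edge features on the captured image.
    model_positions: (u, v) of the associated model edge features on the
                     projection image.
    reliabilities:   weight w for each associated pair.
    A smaller return value is taken here to mean a higher degree of matching.
    """
    total = 0.0
    for (ui, vi), (um, vm), w in zip(image_positions, model_positions, reliabilities):
        total += w * math.hypot(ui - um, vi - vm)
    return total
```

With this convention, an association whose image edge feature is unreliable (small w) contributes little to the total, so it has little influence on which position and orientation is judged to match best.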

In the present embodiment, the matching unit 107 uses the coarse position/orientation of the target object as an initial position and orientation between the viewpoint and the three-dimensional model. The matching unit 107 then performs optimization of the position and orientation between the viewpoint and the three-dimensional model so that a degree of matching becomes higher. When the degree of matching becomes highest, the matching unit 107 determines that the image of the three-dimensional model matches the image of the target object, i.e. that the position and orientation of the three-dimensional model with respect to the viewpoint represents the position and orientation of the target object with respect to the image capturing apparatus 108. Details of this optimization calculation are also explained later.

The reliability calculation unit 105 calculates the reliability w for an image edge feature that corresponds to a model edge feature, based on the deterioration degree for the model edge feature. In the present embodiment, the reliability w is calculated so that the higher the deterioration degree for the model edge feature, the lower the reliability w.

An example of specific processing is described below. Firstly, the reliability calculation unit 105, for each model edge feature for which a corresponding image edge feature has been detected, obtains from the deterioration degree holding unit 102 information of a deterioration degree σ₀, which is stored in association with the position and orientation of the target object with respect to the image capturing apparatus. In this way, the reliability calculation unit 105 has a deterioration degree obtainment unit that obtains the deterioration degree that the deterioration degree holding unit 102 holds. The reliability calculation unit 105 then calculates the reliability w for the image edge feature based on the obtained deterioration degree σ₀.

The reliability w is not particularly limited as long as it is a value defined to be higher as the probability that the image edge feature is correctly extracted becomes higher. In an embodiment where at least one of the bokeh amount and the blur amount is used as the deterioration degree, the reliability w is defined so that the reliability w becomes lower as the bokeh amount or the blur amount becomes larger. In the present embodiment, the reliability w is defined by using a function in which a statistic of the deterioration degree σ₀ obtained in accordance with the bokeh amount and the blur amount is included as a parameter. As the statistic, an average value, a variance value, or the like of the deterioration degree is given, but the statistic is not particularly limited. For example, the reliability w can be expressed by using a Tukey function shown in Equation (1).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 1} \right\rbrack & \; \\{{w(b)} = \left\{ \begin{matrix}\left( {1 - \left( {b/c} \right)^{2}} \right)^{2} & {{b} \leq c} \\0 & {{b} > c}\end{matrix} \right.} & (1)\end{matrix}$

In Equation (1), c is an average value of the deterioration degree σ₀ for the model edge features for which a corresponding image edge feature is detected (in other words, a statistic), and b is the deterioration degree σ₀ of the respective model edge feature. Note that the type of the function is not particularly limited, and another function, in which the reliability w becomes lower as the deterioration degree becomes higher and the reliability w becomes higher as the deterioration degree becomes lower, can be used. For example, the reliability may be expressed by a Gaussian function, a Huber function, or the like. In addition, the reliability w may be expressed by a function, such as a Tukey function, in which an allowable value of the deterioration degree that is set in advance is used instead of the statistic c.
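Equation (1) can be illustrated by the following short sketch, in which c is taken as the average of the deterioration degrees σ₀ of the associated model edge features, as described above. The function names are hypothetical, and the handling of a non-positive c is an assumption added only so that the sketch does not divide by zero.

```python
def tukey_reliability(b, c):
    """Reliability w(b) of Equation (1) for deterioration degree b and statistic c."""
    if c <= 0:
        raise ValueError("c must be positive")  # assumption: c > 0
    if b > c:
        return 0.0
    ratio = b / c
    return (1.0 - ratio * ratio) ** 2

def reliabilities_from_degrees(sigma0_values):
    """Compute w for each model edge feature, using the average sigma_0 as c."""
    c = sum(sigma0_values) / len(sigma0_values)
    return [tukey_reliability(b, c) for b in sigma0_values]
```

Note that with c set to the average value, every feature whose deterioration degree is at or above the average receives a reliability of 0; using an allowable value set in advance in place of c, as mentioned above, relaxes this.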

<Processing According to the Present Embodiment>

Next, an example of processing in the present embodiment is explained with reference to the flowchart of FIG. 2. In step S201, the deterioration degree holding unit 102 determines whether a deterioration degree is already held. If the deterioration degree is held, the processing proceeds to step S204. If the deterioration degree is not held, the processing proceeds to step S202. In step S202, the deterioration degree calculation unit 101 calculates the deterioration degree. A method of calculating the deterioration degree is described later. In step S203, the deterioration degree holding unit 102 holds the deterioration degree calculated in step S202. In this way, in the present embodiment, calculation and holding of the deterioration degree are performed as initial processing of the information processing apparatus 1 in step S202 and step S203 before the image obtaining unit 103 obtains the captured image. The deterioration degree holding unit 102 may obtain and hold a deterioration degree that is calculated in advance from an external apparatus, a storage medium, or the like.

In step S204, the image obtaining unit 103 obtains the captured image obtained by the image capturing apparatus 108 capturing the target object. In step S205, the feature extraction unit 104 extracts an image edge feature from the captured image obtained in step S204. The feature extraction unit 104 sets a filter based on the deterioration degree obtained from the deterioration degree holding unit 102, and extracts the image edge feature by applying the filter to the captured image. Specific processing in step S205 is described later. In step S206, the matching unit 107 calculates the position and orientation of the target object. Specific processing is described later.

<Deterioration Degree Calculation Processing>

Next, explanation is given with reference to the flowchart of FIG. 3 for processing of step S202, in which the deterioration degree calculation unit 101 calculates the deterioration degree. In the present embodiment, the deterioration degree calculation unit 101 arranges the viewpoint and the three-dimensional model of the target object in accordance with various positions and orientations of the target object with respect to the image capturing apparatus 108. Considering image capturing conditions according to the image capturing apparatus 108, the deterioration degree of the image of the three-dimensional model seen from the viewpoint is calculated. The deterioration degree thus obtained can be used as an estimated value of the deterioration degree of the image of the target object obtained when the image capturing apparatus 108 captures the target object.

In the present embodiment, a bokeh amount D, a blur amount B, and a deterioration degree σ₀ obtained in accordance with the bokeh amount and the blur amount are calculated as the deterioration degree. In addition, the image capturing conditions for the target object in accordance with the image capturing apparatus 108 include the relative speed between the target object and the image capturing apparatus 108, as well as image capturing parameters, such as the exposure time, the focal position, and the aperture of the image capturing apparatus 108, or the like. Explanation is given below of a method of calculating the deterioration degree in accordance with a simulation that uses these pieces of information.

Explanation is given below for a method of calculating the deterioration degree for each model edge feature that the model holding unit 106 holds position information for. In another embodiment, the deterioration degree calculation unit 101 detects a position of a model edge feature for various relative positions and orientations of the target object with respect to the image capturing apparatus 108, and calculates the deterioration degree for each of the model edge features. In such a case, the deterioration degree calculation unit 101 can detect the model edge feature by using the method described above for extracting the model edge feature that the model holding unit 106 holds. The deterioration degree for the detected model edge feature can be calculated as below.

In step S301, the deterioration degree calculation unit 101 obtains image capturing conditions for when the image capturing apparatus 108 captures the target object. The image capturing conditions include an image capturing parameter that has an effect on the captured image in accordance with the image capturing apparatus 108: for example the focal distance, the focus position, the aperture value, the exposure time, or the like. The image capturing conditions include the relative speed between the image capturing apparatus 108 and the target object. The relative speed expresses a relative movement direction and speed between the target object and the image capturing apparatus 108.

For example, when the target object performs a translational motion in one axial direction on a conveyor belt, it is possible to calculate a movement speed and a movement direction of the target object based on setting data, a setting value, or the like. The movement direction and movement speed of the target object may be detected by using a sensor or the like. The relative speed between the image capturing apparatus 108 and the target object can be calculated based on the movement direction and the movement speed of the target object, as well as the relative position and orientation of the target object with respect to the image capturing apparatus 108 associated with the model edge feature.

In step S302, the deterioration degree calculation unit 101 selects one model edge feature, for which the deterioration degree is calculated, from a group of model edge features that the model holding unit 106 holds position information for. In step S303, the deterioration degree calculation unit 101 calculates a predicted bokeh amount D for the model edge feature selected in step S302. The method for calculating the bokeh amount is not particularly limited; it can be calculated by using a publicly known formula. The calculation of the bokeh amount can be performed by using an image capturing parameter such as the focal distance of the image capturing apparatus 108, the aperture value of the image capturing apparatus 108 and the distance between the image capturing apparatus 108 and the focal plane, as well as a distance between the image capturing apparatus 108 and the model edge feature, or the like. In the present embodiment, the deterioration degree calculation unit 101 uses Equation (2) below to calculate the bokeh amount D.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 2} \right\rbrack & \; \\{D = \frac{f^{2}\left( {L_{o} - L_{n}} \right)}{{{FL}_{n}\left( {L_{o} - f} \right)}\Delta \; d}} & (2)\end{matrix}$

In Equation (2), f represents the focal distance of an imaging lens of the image capturing apparatus 108. L₀ represents a focus position of a virtual viewpoint. L_n represents a distance from the virtual viewpoint to the model edge feature. F represents an aperture value of the imaging lens of the image capturing apparatus 108. Δd represents the size of a pixel. L_n can be calculated based on position information of the model edge feature. An image capturing parameter of the image capturing apparatus 108, such as the aperture value or focal distance of the imaging lens, may be set in advance in accordance with a specification of the image capturing apparatus 108, or may be obtained from the image capturing apparatus 108 by the deterioration degree calculation unit 101. In addition, an image capturing parameter may be calibrated in advance in accordance with, for example, a method disclosed in R. Y. Tsai, “A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses”, IEEE Journal of Robotics and Automation, vol. RA-3, no. 4, 1987. Image capturing parameters may further include a lens distortion parameter or the like, and the bokeh amount may be calculated in reference to such a further parameter.
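As an illustration, Equation (2) can be evaluated directly as in the following sketch. The function and argument names are hypothetical, and it is assumed that all lengths (f, L₀, L_n, and Δd) are given in the same unit, so that D is obtained in pixels.

```python
def bokeh_amount(f, focus_position, distance, aperture, pixel_size):
    """Evaluate Equation (2).

    f:              focal distance of the imaging lens
    focus_position: L_0, focus position of the virtual viewpoint
    distance:       L_n, distance from the viewpoint to the model edge feature
    aperture:       F, aperture value of the imaging lens
    pixel_size:     delta d, size of a pixel
    All lengths are assumed to share one unit.
    """
    numerator = f * f * (focus_position - distance)
    denominator = aperture * distance * (focus_position - f) * pixel_size
    return numerator / denominator
```

As written, Equation (2) yields zero for a model edge feature on the focal plane (L_n = L₀) and a signed value elsewhere; taking the magnitude before further use would be one possible design choice, but that is not stated in the description.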

In step S304, the deterioration degree calculation unit 101 calculates a predicted blur amount for the model edge feature selected in step S302. The method for calculating the blur amount is not particularly limited, and the blur amount can be calculated by using a publicly known formula. Below, explanation is given for an example of a method for calculating the blur amount. In the present embodiment, the deterioration degree calculation unit 101 calculates the movement amount on the captured image of the model edge feature during the exposure time as the blur amount. The blur amount can be calculated based on the relative position and orientation of the model edge feature with respect to the image capturing apparatus 108, and image capturing conditions according to the image capturing apparatus 108, such as the exposure time and the relative speed between the image capturing apparatus 108 and the target object.

In the present embodiment, the deterioration degree calculation unit 101 calculates a Jacobian of a model edge feature on the captured image. The deterioration degree calculation unit 101 then calculates the blur amount of a model edge feature based on the Jacobian of the model edge feature, the relative speed between the target object and the image capturing apparatus 108, as well as the exposure time.

The Jacobian of the model edge feature is a value that represents the rate at which the position of the image of the model edge feature changes on the captured image when the six-degrees-of-freedom position and orientation parameter of the target object changes slightly.

Below, the position and orientation of the target object is represented as s, the position of the image of the model edge feature when exposure starts is represented as (u, v), the position of the image of the model edge feature when exposure ends is represented as (u′, v′), and a normal direction (unit vector) of the image of the model edge feature is represented as (n_u, n_v). Thereby, a signed distance err_2D between the image of the model edge feature when exposure starts and the image of the model edge feature when the exposure ends can be calculated by the following Equation (3).

[EQUATION 3]

$err_{2D} = n_{u}\left( u' - u \right) + n_{v}\left( v' - v \right) \quad (3)$

The position and orientation s of the target object is a six-dimensional vector, and has three elements (s₁, s₂, s₃) that express the position of the target object and three elements (s₄, s₅, s₆) that express the orientation of the target object. The method of expressing orientation by the three elements is not particularly limited. For example, the orientation can be expressed by Euler angles. In addition, it is possible to express the orientation by a three-dimensional vector, for which the norm of the vector expresses a rotation angle and the direction of the vector represents a rotation axis that passes through the origin point. By partially differentiating the signed distance err_2D by each element of the position and orientation s, it is possible to calculate a Jacobian matrix J_2D of a model edge feature, as in the following Equation (4).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 2} \right\rbrack & \; \\{J_{2\; D} = \left\lbrack {\frac{\partial{err}_{2\; D}}{\partial s_{1}}\mspace{14mu} \frac{\partial{err}_{2\; D}}{\partial s_{2}}\mspace{14mu} \frac{\partial{err}_{2\; D}}{\partial s_{3}}\mspace{14mu} \frac{\partial{err}_{2\; D}}{\partial s_{4}}\mspace{14mu} \frac{\partial{err}_{2\; D}}{\partial s_{5}}\mspace{14mu} \frac{\partial{err}_{2\; D}}{\partial s_{6}}} \right\rbrack} & (4)\end{matrix}$

By performing the above processing on each model edge feature, the deterioration degree calculation unit 101 calculates the Jacobian of each model edge feature. Accordingly, the blur amount B of the model edge feature that occurs in accordance with the target object moving at the relative speed V during an exposure time t_i of the image can be calculated in accordance with the following Equation (5) by using the Jacobian of the model edge feature.

[EQUATION 5]

$B = t_{i}\, J_{2D}\, V \quad (5)$

The obtained blur amount B is a scalar, and represents a movement amount of the position of the image of the model edge feature on the captured image during the exposure time. By performing the above processing on each model edge feature, the deterioration degree calculation unit 101 calculates the blur amount B for each model edge feature.
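Equations (3) and (5) can be sketched as below. The function names are hypothetical. Because J_2D has six elements and B is a scalar, the relative speed V is assumed here to be a six-element vector giving the rate of change of each element of s; the description does not state its representation explicitly.

```python
def signed_distance_2d(normal, start, end):
    """Equation (3): displacement of the edge image along its unit normal.

    normal: (n_u, n_v), start: (u, v) at exposure start,
    end: (u', v') at exposure end.
    """
    n_u, n_v = normal
    return n_u * (end[0] - start[0]) + n_v * (end[1] - start[1])

def blur_amount(jacobian_2d, relative_speed, exposure_time):
    """Equation (5): B = t_i * J_2D * V.

    jacobian_2d:    six partial derivatives of err_2D (Equation (4))
    relative_speed: six-element rate of change of s (an assumption)
    """
    if len(jacobian_2d) != 6 or len(relative_speed) != 6:
        raise ValueError("expected six-element vectors")
    dot = sum(j * v for j, v in zip(jacobian_2d, relative_speed))
    return exposure_time * dot
```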

In step S305, the deterioration degree calculation unit 101 calculates the deterioration degree σ₀ by using the blur amount B obtained in step S304 and the bokeh amount D obtained in step S303, for each model edge feature. The specific method of calculating the deterioration degree σ₀ is not particularly limited. In the present embodiment, the deterioration degree calculation unit 101 calculates the deterioration degree σ₀ so that the deterioration degree σ₀ becomes larger as the bokeh amount D and the blur amount B become larger. For example, the deterioration degree σ₀ can be calculated by using Equation (6). The deterioration degree σ₀ can be defined by any method as long as it becomes larger as the deterioration of the image becomes greater.

[EQUATION 6]

$\sigma_{0} = \sqrt{D^{2} + B^{2}} \quad (6)$
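For illustration, Equation (6) applied to each model edge feature can be written as the following sketch. The function names and the representation of the per-feature amounts as a list of (D, B) pairs are assumptions.

```python
import math

def deterioration_degree(bokeh_d, blur_b):
    """Equation (6): sigma_0 as the root of the sum of squares of D and B."""
    return math.hypot(bokeh_d, blur_b)

def deterioration_degrees(amounts):
    """Compute sigma_0 for each model edge feature from its (D, B) pair."""
    return [deterioration_degree(d, b) for d, b in amounts]
```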

In step S306, the deterioration degree calculation unit 101 determines whether calculation of the deterioration degree (in other words, the bokeh amount D and the blur amount B) has been performed for all model edge features. When the calculation of the deterioration degree is finished, processing terminates. If calculation of the deterioration degree has not finished, the processing returns to step S302. The deterioration degree holding unit 102 holds the thus calculated deterioration degrees (in other words, the bokeh amount D and the blur amount B) in association with model edge features.

<Filter Setting and Edge Extraction Processing>

Next, the flowchart of FIG. 4 is used to give an explanation of detailed processing in step S205 that extracts an image edge feature from a captured image. In the present embodiment, the feature extraction unit 104 sets the filter in accordance with the deterioration degree of the captured image. The feature extraction unit 104 then extracts the image edge feature by applying the set filter to the obtained image. In the present embodiment, one filter is set in accordance with the position and orientation of the target object, and the image edge feature is extracted by applying this one filter to the whole of the captured image.

In step S401, the feature extraction unit 104 obtains the captured image of the target object that the image obtaining unit 103 obtained. In step S402, the feature extraction unit 104 obtains from the deterioration degree holding unit 102 a group of deterioration degrees corresponding to the position and orientation of the target object. In the present embodiment, the bokeh amount D and the blur amount B are obtained as the deterioration degree. The bokeh amount D and the blur amount B are held for each model edge feature, and in the following processing, statistics of the bokeh amount D and the blur amount B, for example average values for the plurality of model edge features, are used.

In step S403, the feature extraction unit 104 sets an extraction parameter for extracting the edge feature. In the present embodiment, the feature extraction unit 104 sets a filter for extracting the image edge feature from the captured image. The feature extraction unit 104 calculates a predicted waveform of an edge feature with reference to the bokeh amount D and the blur amount B obtained in step S402. Specifically, the feature extraction unit 104 calculates the predicted waveform of the image of the model edge feature by performing a convolution calculation of a Gaussian function having a standard deviation D and a rectangular wave for which the width is B and the height is 1, as shown in FIGS. 5A to 5F.

FIGS. 5A and 5D show waveforms 501, 504 calculated by using the bokeh amount D and the blur amount B. Next, as shown in FIGS. 5B and 5E, the feature extraction unit 104 calculates waveforms 502, 505 by differentiating the obtained waveforms 501, 504. The waveforms 502, 505 thus obtained are extraction filters corresponding to the waveforms 501, 504, with which edges can be extracted from the waveforms 501, 504 with high precision. Finally, the feature extraction unit 104 obtains the extraction filters 503, 506 by using a predetermined threshold to quantize the waveforms 502, 505. Thus, as shown in FIGS. 5C and 5F, the extraction filters 503, 506, which are used for convolution calculation with respect to the captured image, are set. The extraction filters 503, 506 thus obtained are equivalent to differential filters.

The extraction filter 503 is used when the bokeh amount D and the blur amount B are larger, as shown in the waveform 501. The extraction filter 506 is used when the bokeh amount D and the blur amount B are smaller, as shown in the waveform 504. In this way, in the present embodiment, a larger filter is used as the deterioration degree (the bokeh amount D and the blur amount B) becomes larger, and a smaller filter is used as the deterioration degree becomes smaller. In addition, instead of the bokeh amount D and the blur amount B, the deterioration degree σ₀ calculated as described above can be used. In such a case as well, a larger filter is used as the deterioration degree σ₀ becomes larger, and a smaller filter is used as the deterioration degree σ₀ becomes smaller.

Here, explanation is given for when a one-dimensional filter is used as the extraction filter. However, it is also possible to use a two-dimensional filter as the extraction filter. In such a case as well, a larger filter can be used as the deterioration degree becomes larger, and a smaller filter can be used as the deterioration degree becomes smaller.

In step S404, the feature extraction unit 104 extracts an edge feature (image edge feature) by applying the filter set in step S403 to the captured image. In the present embodiment, a map that indicates edge intensity is obtained by applying the filter to the captured image.
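As a rough illustration of step S404, the sketch below applies a one-dimensional filter along each row of an image to obtain an edge intensity map. Applying the filter row by row, replicating the border pixels, and representing the image as a list of rows are all assumptions made for the example; the text does not specify them.

```python
def convolve_row(row, kernel):
    """Same-size 1D convolution of one image row, with border replication."""
    half = len(kernel) // 2
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, coeff in enumerate(kernel):
            j = i + half - k             # convolution flips the kernel
            j = min(max(j, 0), n - 1)    # replicate border pixels
            acc += coeff * row[j]
        out.append(acc)
    return out

def edge_intensity_map(image, kernel):
    """Apply the extraction filter to every row of the image."""
    return [convolve_row(row, kernel) for row in image]
```

For a row containing a single step from 0 to 10 and the kernel `[1, 0, -1]`, the response is non-zero only at the two pixels adjacent to the step.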

<Processing to Estimate Position and Orientation>

Next, explanation is given with reference to the flowchart of FIG. 6 for a method of estimating the position and orientation of the target object in step S206. In the present embodiment, the matching unit 107 estimates the position and orientation of the target object by optimizing the position and orientation of the three-dimensional model so that the image edge features and the images of the model edge features projected on the captured image fit each other. Here, the matching unit 107 uses the Gauss-Newton method to optimize the position and orientation of the three-dimensional model so that a sum total of distances on the captured image between images of model edge features and image edge features is minimized.

In step S601, the matching unit 107 performs initialization processing. Firstly, the matching unit 107 obtains a coarse position and orientation of the target object, and sets it as the position and orientation s of the three-dimensional model. As described above, the matching unit 107 obtains from the model holding unit 106 the group of model edge features corresponding to the coarse relative position and orientation of the target object with respect to the image capturing apparatus 108.

In step S602, the matching unit 107 associates each model edge feature in the obtained group with an image edge feature in the group extracted by the feature extraction unit 104. The method of associating is not particularly limited, and associating can be performed as follows, for example. Firstly, the matching unit 107 projects the model edge features onto the captured image surface of the image capturing apparatus 108 based on the current coarse value of the relative position and orientation of the target object with respect to the image capturing apparatus 108. The matching unit 107 thereby calculates the position and direction of each model edge feature on the captured image surface of the image capturing apparatus 108. FIG. 7A shows the projected model edge feature 701.

Next, the matching unit 107 sets a plurality of control points 702 on the projected model edge feature 701 so that they are evenly spaced apart on the captured image surface. Furthermore, the matching unit 107 sets, for each control point 702, a search line 703 in the normal direction of the projected model edge feature 701. The matching unit 107 then, for each control point 702, searches for an image edge feature present on the search line 703 in a predetermined range from the control point 702.

In FIG. 7B, the origin point is the control point 702, the abscissa axis is the search line, and the ordinate axis shows the absolute value of a pixel value of an edge intensity map. In the present embodiment, the edge intensity map is obtained by the feature extraction unit 104 applying the differential filter to the captured image in step S404. In other words, the absolute value of the pixel value of the edge intensity map corresponds to the absolute value of a luminance gradient of the captured image. The matching unit 107 detects, as the image edge feature, a point for which the absolute value of the luminance gradient is an extreme value. In the present embodiment, the matching unit 107 detects as the corresponding point a point for which the absolute value of the luminance gradient is an extreme value and is greater than a predetermined threshold 705.

FIG. 7B shows corresponding points 704, 706 thus found. From the found corresponding points 704, 706, the matching unit 107 records the corresponding point 704, which is closest to the control point 702, as the corresponding point that corresponds to the control point 702. By this processing, the corresponding point for each of the plurality of control points 702 is detected. The corresponding point 704 thus detected is a point on an image edge feature 800 corresponding to the model edge feature 701.
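The search along the search line 703 can be illustrated with the following minimal sketch. It assumes that the absolute values of the edge intensity map have already been sampled at integer offsets along the search line, with the control point at the center of the list; the function name and this sampling layout are hypothetical.

```python
def find_corresponding_point(abs_gradient, threshold):
    """Return the signed offset (in samples) from the control point to the
    nearest local maximum of |gradient| above the threshold, or None.

    abs_gradient[k] is sampled at offset k - r, where r = len(abs_gradient) // 2.
    """
    r = len(abs_gradient) // 2
    candidates = []
    for k in range(1, len(abs_gradient) - 1):
        v = abs_gradient[k]
        is_extreme = v >= abs_gradient[k - 1] and v > abs_gradient[k + 1]
        if is_extreme and v > threshold:
            candidates.append(k - r)
    if not candidates:
        return None          # no corresponding point for this control point
    return min(candidates, key=abs)
```

Note that the nearest candidate is chosen even when a stronger one exists farther away, which mirrors the selection of the corresponding point 704 over the corresponding point 706.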

Below, processing that uses one control point for each model edge feature is performed. From the plurality of control points set for one model edge feature, the matching unit 107 can select, and use in later processing, one control point for which a corresponding point is detected. The method of selecting is not particularly limited. A corresponding image edge feature may not be detected for some model edge features.

In steps S603 to S605, the matching unit 107 calculates the position and orientation of the target object as described above. In an embodiment, the matching unit 107 can calculate the position and orientation of the target object as follows. For example, the matching unit 107 calculates a degree of matching with the image of the target object for a plurality of three-dimensional model images obtained by changing the position and orientation. Here, the degree of matching is obtained based on distances between the image positions of a plurality of image edge features and the image positions of the corresponding model edge features, the distances being weighted in accordance with the deterioration degree corresponding to the position at which each image edge feature was extracted. The weighting is performed so that the weight becomes smaller as the deterioration degree becomes higher.

In one example, the matching unit 107 uses as an evaluation function a sum total of weighted distances obtained by multiplying the reliability w as the weight with the distance on the captured image between the corresponding model edge feature and the image edge feature. The evaluation function represents a degree of matching between the three-dimensional model image and the image of the target object.

Then the matching unit 107 determines, from a plurality of three-dimensional model images, the three-dimensional model image that provides the highest degree of matching as the three-dimensional model image corresponding to the image of the target object. The matching unit 107 determines that the position and orientation of the three-dimensional model with respect to the viewpoint that corresponds to the three-dimensional model image thus determined represent the position and orientation of the target object with respect to the image capturing apparatus 108.

However, in the present embodiment, the position and orientation of the target object are calculated by using the Gauss-Newton method as follows. By virtue of the following method, the position and orientation of the three-dimensional model with respect to the viewpoint are repeatedly updated so that the degree of matching between the image of the target object and the image of the three-dimensional model becomes large.

In step S603, the matching unit 107 calculates a coefficient matrix J and an error vector E, as follows. Each element of the coefficient matrix J is a first-order partial differential coefficient of the image coordinates of the model edge feature with respect to a slight change of the position and orientation of the three-dimensional model. Each element of the error vector E is the distance on the captured image between the image edge feature and the model edge feature projected on the captured image.

FIG. 8 shows a relation between the model edge feature 701 projected on the captured image and the image edge feature 800 detected in step S602 that corresponds to the model edge feature 701. In FIG. 8, a u axis and a v axis respectively correspond to a horizontal direction and a vertical direction of the image. The position of the control point 702 on the captured image is expressed as (u₀, v₀), and the tilt with respect to the u axis on the captured image of the model edge feature 701, to which the control point 702 belongs, is represented as θ. In the following explanation, the tilt at the control point 702 of the model edge feature 701 is θ. In such a case, the normal vector of the model edge feature 701, and in particular the normal vector at the control point 702, is (sin θ, −cos θ). In addition, the image coordinates of the corresponding point 704 that corresponds to the control point 702 are (u′, v′).

A point (u, v) on a straight line that passes through the point (u′, v′) and for which the tilt is θ can be expressed as:

[EQUATION 7]

u sin θ−v cos θ=d  (7)

d=u′ sin θ−v′ cos θ

In Equation (7), θ is the constant described above, and d is a constant given by the second expression of Equation (7).

The position of the control point 702 on the captured image changes in accordance with the position and orientation of the three-dimensional model. As described above, the position and orientation of the three-dimensional model have six degrees of freedom, and the position and orientation s of the three-dimensional model are represented by a six-dimensional vector. The image coordinates (u, v) of the point on the model edge feature 701 corresponding to the control point 702 after the position and orientation of the three-dimensional model have changed can be approximated as in Equation (8) by using a first order Taylor expansion in the neighborhood of (u₀, v₀). In Equation (8), Δs_i (i=1, 2, . . . , 6) represents a slight change to each component of the six-dimensional vector s.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 8} \right\rbrack & \; \\{{u \approx {u_{0} + {\sum\limits_{i = 1}^{6}\; {\frac{\partial u}{\partial s_{i}}\Delta \; s_{i}}}}}{v \approx {v_{0} + {\sum\limits_{i = 1}^{6}\; {\frac{\partial v}{\partial s_{i}}\Delta \; s_{i}}}}}} & (8)\end{matrix}$

If the position and orientation of the three-dimensional model are changed so as to match the position and orientation of the target object, the image coordinates of a point on the model edge feature 701 corresponding to the control point 702 can be assumed to move onto the image edge feature 800, i.e. the straight line represented by Equation (7). Equation (9) can be obtained by substituting the (u, v) approximated by Equation (8) into Equation (7). In Equation (9), r is a constant given by the second expression of Equation (9).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 9} \right\rbrack & \; \\{{{{\sin \; \theta {\sum\limits_{i = 1}^{6}\; {\frac{\partial u}{\partial s_{i}}\Delta \; s_{i}}}} - {\cos \; \theta {\sum\limits_{i = 1}^{6}\; {\frac{\partial v}{\partial s_{i}}\Delta \; s_{i}}}}} = {d - r}}{r = {{u_{0}\sin \; \theta} - {v_{0}\cos \; \theta}}}} & (9)\end{matrix}$

Equation (9) holds true for all model edge features for which a corresponding image edge feature is detected in step S602. Accordingly, the linear simultaneous equation for Δs_i shown in Equation (10) holds true.

$\begin{matrix}{\mspace{79mu} \left\lbrack {{EQUATION}\mspace{14mu} 10} \right\rbrack} & \; \\{\begin{bmatrix}{{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{1}}} - {\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{1}}}} & {{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{2}}} - {\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{2}}}} & \ldots & \begin{matrix}{{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{6}}} -} \\{\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{6}}}\end{matrix} \\{{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{1}}} - {\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{1}}}} & {{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{2}}} - {\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{2}}}} & \ldots & \begin{matrix}{{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{6}}} -} \\{\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{6}}}\end{matrix} \\\vdots & \vdots & \ddots & \vdots\end{bmatrix}{\quad{\begin{bmatrix}{\Delta \; s_{1}} \\{\Delta \; s_{2}} \\{\Delta \; s_{3}} \\{\Delta \; s_{4}} \\{\Delta \; s_{5}} \\{\Delta \; s_{6}}\end{bmatrix} = \begin{bmatrix}{d_{1} - r_{1}} \\{d_{2} - r_{2}} \\\vdots\end{bmatrix}}}} & (10)\end{matrix}$

Here, Equation (10) is expressed as in Equation (11).

[EQUATION 11]

JΔs=E  (11)
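As an illustration of how the coefficient matrix J and the error vector E of Equation (11) are assembled from Equation (10), the following sketch builds one row per associated control point. The partial derivatives ∂u/∂s_i and ∂v/∂s_i are assumed to be supplied by the caller (how they are obtained is outside this sketch), and the tuple layout of each correspondence is hypothetical.

```python
import math

def build_system(correspondences):
    """Assemble J and E of Equation (11).

    Each correspondence is (theta, du_ds, dv_ds, d, r), where du_ds and
    dv_ds are the six partial derivatives of u and v with respect to s.
    """
    J, E = [], []
    for theta, du_ds, dv_ds, d, r in correspondences:
        s, c = math.sin(theta), math.cos(theta)
        # One row of Equation (10): sin(theta)*du/ds_i - cos(theta)*dv/ds_i
        J.append([s * du - c * dv for du, dv in zip(du_ds, dv_ds)])
        E.append(d - r)
    return J, E
```

Each detected correspondence contributes one row, so J has N_c rows and six columns in the three-dimensional case.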

Calculation of the partial differential coefficients for calculating the coefficient matrix J of the linear simultaneous equation of Equation (11) can be performed by using a publicly known method. For example, the partial differential coefficients can be calculated by the method disclosed in V. Lepetit and P. Fua, “Keypoint recognition using randomized trees”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 9, 2006.

In step S604, the reliability calculation unit 105 calculates the reliability w for an image edge feature associated with a model edge feature. Calculation of the reliability w is performed as described above. The matching unit 107 then obtains the reliability w calculated by the reliability calculation unit 105. When calculating the reliability w, it is possible to use the deterioration degree corresponding to the coarse position and orientation obtained in step S601. Alternatively, when calculating the reliability w, a deterioration degree corresponding to the current position and orientation s of the three-dimensional model may be used.

The reliability w may be calculated in advance for all model edge features, and this reliability w, for example, may be held by the deterioration degree holding unit 102. In such a case, the matching unit 107 can obtain the reliability for the corresponding model edge feature as the reliability w for the image edge feature from the deterioration degree holding unit 102.

The reliability w for each image edge feature is used as the weight for each image edge feature in the calculation of the correction value Δs in step S605. In the following explanation, a weight matrix W shown in Equation (12) is used, which has as its coefficients the reliabilities w of the image edge features for which a correspondence was detected.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 12} \right\rbrack & \; \\{W = \begin{bmatrix}w_{1} & \; & \; & 0 \\\; & w_{2} & \; & \; \\\; & \; & \ddots & \; \\0 & \; & \; & w_{N_{c}}\end{bmatrix}} & (12)\end{matrix}$

The weight matrix W is a square matrix whose components other than the diagonal elements are 0. Each diagonal element is the reliability w_i (i=1 to N_c) of the corresponding edge feature, and is used as the weight. N_c is the total number of image edge features associated with model edge features.

Equation (11) is transformed by using the weight matrix W as in Equation (13).

[EQUATION 13]

WJΔs=WE  (13)

In step S605, the matching unit 107 calculates the correction value Δs for the position and orientation of the three-dimensional model. In the present embodiment, the matching unit 107 calculates the correction value Δs by solving Equation (13). More specifically, Equation (13) can be solved as shown in Equation (14).

[EQUATION 14]

$\Delta s = (J^{T} W J)^{-1} J^{T} W E$  (14)

Solving Equation (13) as shown in Equation (14) corresponds to taking the difference between the two sides of each row of Equation (13), namely w_i F_i, and obtaining the Δs_1 to Δs_6 that minimize the sum of squares of these differences. Here, F_i = (sin θ_i (∂u_i/∂s_1) − cos θ_i (∂v_i/∂s_1))Δs_1 + . . . − (d_i − r_i). F_i is calculated for each model edge feature, and can be considered as an evaluation value that indicates the residual error of the distance between an image edge feature and a model edge feature on the captured image after the position and orientation of the three-dimensional model move by Δs. In addition, S=Σ(w_i F_i)² is the sum of squares of the weighted residual error. Accordingly, solving Equation (13) as shown in Equation (14) corresponds to minimizing the evaluation function S obtained by weighting, by the reliability w_i, the evaluation value F_i, which indicates the fitting error, for each model edge feature. In other words, the correction value Δs of the position and orientation for which the degree of matching between the image of the three-dimensional model and the image of the target object becomes higher is obtained by the processing of step S605.
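The calculation of the correction value Δs can be sketched as follows. The sketch implements Equation (14) literally, with W taken as the diagonal matrix of the reliabilities w_i, and solves the resulting normal equations by Gaussian elimination instead of forming an explicit inverse; that choice of solver, the function names, and the list-of-rows matrix layout are assumptions made for the example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def weighted_correction(J, E, w):
    """Delta s = (J^T W J)^-1 J^T W E with W = diag(w), as in Equation (14)."""
    rows, n = len(J), len(J[0])
    JtWJ = [[sum(w[k] * J[k][i] * J[k][j] for k in range(rows))
             for j in range(n)] for i in range(n)]
    JtWE = [sum(w[k] * J[k][i] * E[k] for k in range(rows)) for i in range(n)]
    return solve_linear(JtWJ, JtWE)
```

When all reliabilities are equal, the result does not depend on their common value; a row with reliability 0 has no influence on Δs.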

In step S606, the matching unit 107 uses the correction value Δs calculated in step S605 to update the position and orientation of the three-dimensional model to s+Δs. In this way, the position and orientation s can be corrected so that the distance between the feature of the three-dimensional model and the associated feature of the target object becomes small.

In step S607, the matching unit 107 determines whether the position and orientation of the three-dimensional model have converged. When it is determined that the position and orientation have converged, the matching unit 107 outputs the position and orientation of the three-dimensional model at that time as an estimated position and orientation of the target object obtained by the fitting. The processing then terminates. If it is determined that the position and orientation have not converged, the processing returns to step S602, and the matching unit 107 repeats the processing of steps S602 to S606 until the position and orientation converge. To simplify the processing, the processing may return to step S603 when the position and orientation have not converged.

In the present embodiment, if the correction value Δs obtained in step S605 is less than or equal to a predetermined value and mostly does not change, the matching unit 107 determines that the position and orientation have converged. For example, if the correction value Δs continues to be less than or equal to the predetermined value for a predetermined number of times, the matching unit 107 can determine that the position and orientation have converged. A method of determining convergence is not limited to this method. For example, if the number of iterations of steps S602 to S606 has reached a predetermined number of times, the matching unit 107 may determine that the position and orientation have converged.
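The two convergence criteria described above can be combined in a small helper such as the following. This is a sketch only: measuring the size of Δs by its Euclidean norm is an assumption (the text does not say how the correction value is compared with the predetermined value), and the class and parameter names are hypothetical.

```python
import math

class ConvergenceMonitor:
    """Tracks the corrections produced by successive iterations."""

    def __init__(self, tolerance, patience, max_iterations):
        self.tolerance = tolerance            # predetermined value for |delta s|
        self.patience = patience              # required consecutive small steps
        self.max_iterations = max_iterations  # iteration limit
        self.small_steps = 0
        self.iterations = 0

    def update(self, delta_s):
        """Register one correction; return True once iteration should stop."""
        self.iterations += 1
        norm = math.sqrt(sum(x * x for x in delta_s))
        if norm <= self.tolerance:
            self.small_steps += 1
        else:
            self.small_steps = 0              # a large step resets the count
        return (self.small_steps >= self.patience
                or self.iterations >= self.max_iterations)
```

A caller would invoke `update` once per iteration with the newly computed correction and leave the loop when it returns `True`.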

As explained above, the estimated position and orientation are calculated for the target object by the processing of steps S601 to S606. In the above explanation, a case in which the three-dimensional position and orientation of the target object were calculated was described. However, in another embodiment, it is possible to calculate a two-dimensional position and orientation of the target object. In such a case, a three-dimensional vector having two elements that represent position and one element that represents orientation is used as the vector s. The method of calculating the position and orientation of the target object is not limited to the above-described Gauss-Newton method. For example, it is possible to use a Levenberg-Marquardt method, which is more robust, and it is possible to use a steepest descent method, which is simpler. Furthermore, another nonlinear optimization method, such as a conjugate gradient method or an ICCG method, can be used.

By the first embodiment described above, an image edge feature can be extracted at high precision by considering degradation of the captured image in accordance with bokeh, blur, and the like. The position and orientation of the target object are estimated by causing a model edge feature to fit an image edge feature while performing weighting of edge features in consideration of degradation of the captured image in accordance with bokeh, blur, or the like. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

[First Variation]

In the first embodiment, the deterioration degree is set for each model edge feature. However, the deterioration degree may be set for each pixel or for each segmented area of the captured image. For example, between step S305 and step S306, the deterioration degree calculation unit 101 can project a model edge feature for which the deterioration degree has been calculated onto the captured image surface, and determine the position of the image of the model edge feature. For each pixel included in a segmented area of the captured image that includes the position of the image of the model edge feature, the deterioration degree holding unit 102 can set the deterioration degree calculated for this model edge feature.

In such a case, the reliability calculation unit 105 may use the deterioration degree to set a reliability for each pixel or each segmented area of the captured image. In such a variation, the matching unit 107 can perform fitting processing while using the reliability set for a pixel at which the image edge feature is detected as the reliability for the image edge feature.

[Second Variation]

In step S205 in the first embodiment, one filter is set, and this one filter is applied to the captured image as a whole. However, a plurality of filters may be used for one captured image. In other words, for each position of the captured image, the feature extraction unit 104 can set an extraction parameter used to extract a feature at this position in accordance with the deterioration degree of the image for this position.

For example, if the deterioration degree is set for each segmented area of the captured image as in the first variation, the feature extraction unit 104 can set the extraction filter by using the deterioration degree set for the pixel to which the filter is to be applied. For example, the feature extraction unit 104 can determine the size of the extraction filter as described above, by considering the bokeh amount D and the blur amount B, or by considering the deterioration degree σ₀. In this way, it is possible to extract the edge at higher precision by using a suitable extraction filter in accordance with the deterioration degree of each region of the captured image.

[Third Variation]

In the first embodiment, after an image edge feature was extracted in step S205, association of a model edge feature and the image edge feature was performed in step S206. However, configuration may be taken to omit step S205 and extract an image edge feature corresponding to the model edge feature in step S206.

More specifically, in step S602, the matching unit 107 searches for an image edge feature that is present on the search line 703. At this point, in the first embodiment an image edge feature is detected by referring to the edge intensity map, but in the present variation an image edge feature is detected by referring to the captured image. Specifically, the matching unit 107 calculates a luminance gradient of the captured image along the search line 703 in the captured image by performing a convolution calculation that uses the one-dimensional extraction filter set in step S403 with respect to a one-dimensional array of pixel values present on the search line 703. The matching unit 107 detects, as the image edge feature, a point for which the absolute value of the luminance gradient is an extreme value.

[Fourth Variation]

In the first embodiment, the feature extraction unit 104 extracts an image edge feature referring to the deterioration degree. The matching unit 107 also estimates the position and orientation of the target object using a reliability defined based on a deterioration degree for weighting. However, it is not essential to use the deterioration degree for both the extraction of the image edge feature and the weighting. In the fourth variation, the feature extraction unit 104 refers to the deterioration degree and extracts the image edge feature similarly to the first embodiment. However, the matching unit 107 estimates the position and orientation of the target object without performing weighting based on the reliability.

The information processing apparatus 1 according to the present variation has a configuration similar to that of the first embodiment, except that it does not have the reliability calculation unit 105. Processing in the present variation is similar to that in the first embodiment except for the processing of step S206. Explanation is given below regarding the processing of step S206 in the present variation with reference to the flowchart of FIG. 6.

Steps S601 to S603 are performed similarly to the first embodiment. The processing of step S604 is omitted. In step S605, the matching unit 107 calculates the correction value Δs by solving Equation (11). More specifically, Equation (11) can be solved as shown in Equation (15).

[EQUATION 15]

$\Delta s = (J^{T} J)^{-1} J^{T} E$  (15)

In the present variation, an image edge feature can be extracted at high precision by considering degradation of the captured image in accordance with bokeh, blur, and the like. Thereby, it is possible to improve the precision of estimation of the position and orientation of the target object based on the extracted image edge feature.

[Fifth Variation]

In the fifth variation, the matching unit 107, similarly to the first embodiment, estimates the position and orientation of the target object using a reliability defined based on a deterioration degree for weighting. However, the feature extraction unit 104 extracts the image edge feature by applying a predetermined extraction filter to the captured image without using a deterioration degree.

The information processing apparatus 1 according to the present variation has a configuration similar to that of the first embodiment. Processing in the present variation is similar to that in the first embodiment except for the processing of step S205. In step S205 of the present variation, the feature extraction unit 104 extracts the image edge feature by applying a filter to the obtained image. The filter used is not particularly limited, and for example, a differential filter of an arbitrary shape can be used.

In the present variation, the position and orientation of the target object are estimated by causing a model edge feature to fit an image edge feature while performing weighting of edge features in consideration of degradation of the captured image in accordance with bokeh, blur, or the like. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

[Sixth Variation]

The feature extraction unit 104 in the first embodiment sets a filter for extracting an edge feature based on a deterioration degree, and extracts the image edge feature by applying the set filter to the captured image. In the sixth variation, the filter for extracting the image edge feature is fixed. However, the filter is applied after the size of the captured image is changed.

The information processing apparatus 1 according to the present variation has a configuration similar to that of the first embodiment. Processing in the present variation is similar to that in the first embodiment except for the processing of step S205. Explanation is given below regarding the processing of step S205 in the present variation with reference to the flowchart of FIG. 9. The processing of step S901 and step S902 is similar to step S401 and step S402, respectively.

In step S903, the feature extraction unit 104 sets a resizing rate for the captured image in accordance with the deterioration degree obtained in step S902. In the present variation, the resizing rate is calculated so that the captured image becomes smaller as the deterioration degree (for example the bokeh amount D and the blur amount B) becomes larger.

As a detailed example, similarly to the first embodiment, the feature extraction unit 104 uses the bokeh amount D and the blur amount B to estimate a waveform S of the image of the model edge feature on the captured image. Next, the feature extraction unit 104 sets a resizing rate R so that the waveform S is within a predetermined spread. In the present variation, the feature extraction unit 104 calculates a standard deviation Z of the waveform S, and calculates the resizing rate R=E/Z such that the standard deviation becomes a predetermined value E. Here, the predetermined value E is a value defined in accordance with a filter for extracting an image edge feature. In the present variation, as the filter for extracting the image edge feature, a filter in accordance with a waveform obtained by differentiating a Gaussian function of the standard deviation E is used. A method of setting the resizing rate R is not limited to this method. For example, to reduce the computational complexity, configuration may be taken such that R=E/(D+B). Even in this case, the resizing rate R is calculated so that the captured image becomes smaller as the bokeh amount D and the blur amount B become larger.

In step S904, the feature extraction unit 104 extracts an image edge feature from the captured image obtained by the image obtaining unit 103, based on the resizing rate R set in step S903. More specifically, firstly the feature extraction unit 104 resizes the captured image in accordance with the resizing rate R. If the resizing rate R is greater than 1, the captured image is magnified. Next, the feature extraction unit 104 extracts an image edge feature by applying a filter that is prepared in advance to the resized captured image. The coordinates of an image edge feature detected from the resized captured image can be converted into coordinates of the image edge feature on the captured image before the resizing by referring to the resizing rate and, for example, multiplying by 1/R.
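The arithmetic of steps S903 and S904 can be illustrated as follows. The sketch computes the standard deviation Z by treating the sampled waveform S as a non-negative weight over pixel positions, which is an assumption, since the text does not define how Z is obtained from the waveform. All function names are hypothetical.

```python
import math

def waveform_std(positions, values):
    """Standard deviation of a waveform treated as weights over positions."""
    total = sum(values)
    mean = sum(x * v for x, v in zip(positions, values)) / total
    var = sum(v * (x - mean) ** 2 for x, v in zip(positions, values)) / total
    return math.sqrt(var)

def resizing_rate(E, Z):
    """R = E / Z: the resized waveform has standard deviation E."""
    return E / Z

def simplified_resizing_rate(E, D, B):
    """Cheaper alternative mentioned in the text: R = E / (D + B)."""
    return E / (D + B)

def to_original_coordinates(point, R):
    """Convert coordinates detected on the resized image back (multiply by 1/R)."""
    return (point[0] / R, point[1] / R)
```

For instance, with E = 2 and Z = 4 the image is shrunk by R = 0.5, and an edge detected at (30, 12) in the resized image corresponds to (60, 24) in the original captured image.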

In the method of the present variation, an image edge feature can be extracted at high precision by considering degradation of the captured image in accordance with bokeh, blur, and the like. Thereby, it is possible to improve the precision of estimation of the position and orientation of the target object based on the extracted image edge feature.

[Seventh Variation]

In the first embodiment, a filter for extracting an image edge feature is set based on a deterioration degree, and the image edge feature is extracted by applying the set filter to the captured image. In the seventh variation, the extraction parameter is also set based on the deterioration degree. Specifically, a plurality of filters are set, and the plurality of filters are used to extract the image edge feature.

In the present variation, it is considered that the bokeh amount D and the blur amount B that the deterioration degree calculation unit 101 calculates can include an error. Accordingly, the feature extraction unit 104 sets a plurality of bokeh amounts D±ΔD based on the bokeh amount D that the deterioration degree calculation unit 101 calculated. Similarly, the feature extraction unit 104 sets a plurality of blur amounts B±ΔB based on the blur amount B that the deterioration degree calculation unit 101 calculated. The feature extraction unit 104 uses each combination of the bokeh amounts D±ΔD and the blur amounts B±ΔB to set a filter for extracting the image edge feature, and extracts the image edge feature from the captured image. Furthermore, the feature extraction unit 104 selects at least one extraction result from a plurality of extraction results in accordance with a response value of the filter processing. Specifically, the feature extraction unit 104 determines and outputs an extraction result for which the extraction precision is assumed to be relatively high, by comparing the response values of the filters.

The information processing apparatus 1 according to the present variation has a configuration similar to that of the first embodiment. Processing in the present variation is similar to that of the first embodiment except for the processing of step S205. Explanation is given below regarding the processing of step S205 in the present variation with reference to the flowchart of FIG. 10. The processing of step S1001 and step S1002 is similar to step S401 and step S402, respectively.

In step S1003, the feature extraction unit 104 sets a plurality of extraction filters based on the deterioration degrees obtained in step S1002. Firstly, the feature extraction unit 104 applies a change in a predetermined range (ΔD, ΔB) to the bokeh amount D and the blur amount B. Specifically, a plurality of bokeh amounts and blur amounts are set in the ranges of D±ΔD and B±ΔB. The variation range of the bokeh amount D and the blur amount B, and the number of bokeh amounts and blur amounts to be set, may be set in advance. Next, by combining the set plurality of bokeh amounts and blur amounts, the feature extraction unit 104 sets the plurality of extraction filters. Setting of the extraction filters can be performed as in the first embodiment.

In step S1004, the feature extraction unit 104 uses each of the plurality of extraction filters set in step S1003 to extract an image edge feature from the captured image obtained by the image obtaining unit 103. Thus, a plurality of extraction results for the image edge feature are obtained, one for each extraction filter.

Next, the feature extraction unit 104 selects, from the extracted image edge features, an image edge feature for which the extraction precision is assumed to be relatively high. In the present variation, an extraction result for which a response value of a filter is large is selected. More specifically, an image edge feature for which an extreme value of the luminance gradient calculated by applying the extraction filter is larger is selected. As a specific example, the feature extraction unit 104 selects an image edge feature for which the extraction precision is assumed to be relatively high from a group of image edge features that are present at the same position and are extracted using differing extraction filters. Specifically, from a plurality of neighboring edge features included in a predetermined range E [pixel], the feature extraction unit 104 selects the edge feature for which the extreme value of the luminance gradient, which is the response value of the respective extraction filter, is maximum. The feature extraction unit 104 may select two or more edge features.
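Steps S1003 and S1004 and the subsequent selection can be sketched as follows. This is a simplified illustration under stated assumptions: the bokeh and blur amounts are sampled evenly over D±ΔD and B±ΔB, edge features are represented as (position, response) pairs along a single search line, and edges whose consecutive spacing is at most E pixels are treated as neighboring. The function names and the even sampling are hypothetical and are not taken from the embodiment.

```python
from itertools import product


def candidate_parameters(D, B, delta_D, delta_B, steps=3):
    """Return (bokeh, blur) pairs sampled evenly over D±ΔD and B±ΔB."""
    def samples(center, delta):
        if steps == 1:
            return [center]
        return [center - delta + 2.0 * delta * k / (steps - 1)
                for k in range(steps)]
    return list(product(samples(D, delta_D), samples(B, delta_B)))


def select_strongest(edges, E):
    """edges: (position, response) pairs gathered from all extraction filters.
    Edges whose consecutive spacing is at most E pixels form one group;
    the edge with the largest absolute response is kept from each group."""
    selected, group = [], []
    for pos, resp in sorted(edges):
        if group and pos - group[-1][0] > E:
            selected.append(max(group, key=lambda e: abs(e[1])))
            group = []
        group.append((pos, resp))
    if group:
        selected.append(max(group, key=lambda e: abs(e[1])))
    return selected


params = candidate_parameters(D=2.0, B=1.0, delta_D=0.5, delta_B=0.5)
edges = [(10.0, 5.0), (10.4, 9.0), (30.0, 2.0), (10.8, 7.0)]
print(len(params), select_strongest(edges, E=1.0))
```

Each (bokeh, blur) pair would be used to set one extraction filter as in the first embodiment; the filter construction itself is omitted here.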

According to the method of the present variation, an image edge feature can be extracted with high precision by taking into account degradation of the captured image caused by bokeh, blur, and the like. Furthermore, by using a plurality of extraction filters that allow for error in the bokeh and blur amounts, it is possible to extract the image edge feature with higher precision than in a case of using one type of extraction filter. This makes it possible to improve the precision of estimation of the position and orientation of the target object based on the extracted image edge feature.

In the present variation, explanation was given for a method of setting a plurality of extraction filter shapes. However, as shown in the sixth variation, a plurality of image resizing rates may be set instead. In such a case, a plurality of detection results for the image edge feature are obtained, one for each resizing rate. Then, an image edge feature for which an extreme value of the luminance gradient calculated by applying the extraction filter to the resized captured image is larger is selected.

[Eighth Variation]

In the first embodiment, the deterioration degree calculation unit 101 calculates the deterioration degree by a simulation that considers image capturing conditions according to the image capturing apparatus 108 and a three-dimensional model of the target object. In the eighth variation, the deterioration degree is calculated from the captured image obtained by the image capturing apparatus capturing the target object. The following processing is performed on a plurality of captured images of the target object, which are captured while changing the relative position and orientation of the target object with respect to the image capturing apparatus. The deterioration degree calculated by using each captured image is held by the deterioration degree holding unit 102 in association with the relative position and orientation of the target object with respect to the image capturing apparatus.

The configuration and processing of the information processing apparatus 1 according to the eighth variation are similar to those of the first embodiment, except for the processing of step S202. Explanation is given below regarding the processing of step S202 in the eighth variation with reference to the flowchart of FIG. 11.

In step S1100, the deterioration degree calculation unit 101 extracts a plurality of edge features from a captured image of the target object obtained by the image obtaining unit 103. The method for extracting the edge features is not particularly limited, and for example, an arbitrary differential filter can be used. In step S1101, the deterioration degree calculation unit 101 selects one edge feature from the plurality of edge features extracted in step S1100.

In step S1102, the deterioration degree calculation unit 101 calculates a deterioration degree of the edge feature selected in step S1101. In the present variation, the deterioration degree calculation unit 101 calculates the deterioration degree of the edge feature based on the position of the edge feature and the normal direction of the edge. As a detailed example, by using Equation (16), the deterioration degree calculation unit 101 estimates the luminance value at a pixel of interest on the edge feature in a case where it is assumed that the captured image is degraded in accordance with the deterioration degree σ.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 16} \right\rbrack & \; \\{{{erf}\left( {r,\theta,\sigma} \right)} = {\frac{2}{\sqrt{\pi}}{\int_{- t}^{t}{{\exp \left( {- \frac{\left( {{r\; \cos \; \theta} - x_{0}} \right)^{2} + \left( {{r\; \sin \; \theta} - y_{0}} \right)^{2}}{\sigma^{2}}} \right)}\ {r}}}}} & (16)\end{matrix}$

In Equation (16), x₀, y₀ are the position of a pixel of interest where the edge feature is present, r expresses a distance from the position of the pixel of interest, and θ expresses a normal direction (two-dimensional) of the edge feature. In addition, t expresses a search range from the position of the pixel of interest and is an arbitrary positive value, and σ is the deterioration degree. The value of the deterioration degree σ corresponds to a value in which the bokeh amount and the blur amount are unified.

As shown in Equation (17), the sum of squares of the difference between a luminance value on the captured image and a luminance value at the same position calculated in accordance with Equation (16), taken over each pixel that constitutes the edge feature, is used as an evaluation function E. The deterioration degree calculation unit 101 then estimates the parameter σ by minimizing the evaluation function E by an iterative calculation. For the minimization of the evaluation function E, a publicly known method can be used; for example, a steepest descent method, a Levenberg-Marquardt method, or the like can be used. In Equation (17), I(x, y) is the luminance value of the captured image at the coordinates (x, y).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 17} \right\rbrack & \; \\{E = {\sum\limits_{y}\; {\sum\limits_{x}\; \left\{ {{I\left( {x,y} \right)} - {{erf}\left( {r,\theta,\sigma} \right)}} \right\}^{2}}}} & (17)\end{matrix}$
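The estimation of σ by minimizing a sum of squared luminance differences can be sketched as follows. This is a simplified stand-in rather than the method of the variation: it assumes a one-dimensional luminance profile sampled along the edge normal, models the degraded edge as a unit step blurred by a Gaussian of standard deviation σ instead of evaluating Equation (16) literally, and swaps in a golden-section search for the steepest descent or Levenberg-Marquardt methods named above. All function names are hypothetical.

```python
import math


def blurred_step(r, sigma, low=0.0, high=1.0):
    """Luminance of an ideal step edge blurred by a Gaussian (assumed model)."""
    return low + (high - low) * 0.5 * (1.0 + math.erf(r / (math.sqrt(2.0) * sigma)))


def evaluation(sigma, samples):
    """Sum of squared differences between observed and modeled luminance."""
    return sum((I - blurred_step(r, sigma)) ** 2 for r, I in samples)


def estimate_sigma(samples, lo=0.1, hi=10.0, iterations=60):
    """Golden-section search for the sigma that minimizes the evaluation function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if evaluation(c, samples) < evaluation(d, samples):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Synthetic, noise-free profile generated with a known sigma (assumed value).
true_sigma = 2.0
samples = [(r, blurred_step(r, true_sigma)) for r in range(-8, 9)]
print(estimate_sigma(samples))
```

On real images the low and high luminance levels on either side of the edge would also have to be estimated; they are fixed to 0 and 1 here to keep the sketch short.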

In step S1103, the deterioration degree calculation unit 101 determines whether the deterioration degree has been calculated for all edge features. If the calculation of the deterioration degree is not completed for all edge features, the processing returns to step S1101, and the deterioration degree is calculated for the next edge feature. When the calculation of the deterioration degree for all edge features terminates, the processing of step S202 terminates. In such a case, the deterioration degree holding unit 102 can hold the deterioration degree for each pixel or for each segmented area of the captured image, as in the first variation. For example, for each pixel included in a segmented area of the captured image that includes the position of the image of the extracted edge feature, the deterioration degree calculated for this edge feature can be set.

As discussed above, according to the method of the present variation, it is possible to calculate the deterioration degree from the image that captures the target object. This makes it possible to calculate the deterioration degree more accurately by taking into account an effect that cannot be fully expressed by only the image capturing parameters of the image capturing apparatus 108, the three-dimensional model of the target object, and the like, for example an effect due to noise. By referring to this deterioration degree, it is possible to extract an image edge feature with high precision, and in addition, it is possible to cause a model edge feature to fit the image edge feature while performing weighting with respect to the edge feature. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

Instead of using Equation (16), it is possible to estimate the luminance value at a pixel of interest on an edge feature when it is assumed that the captured image is degraded in accordance with the bokeh amount D and the blur amount B. In such a case, the deterioration degree calculation unit 101 can estimate the bokeh amount D and the blur amount B instead of the deterioration degree σ.

[Ninth Variation]

In the eighth variation, the deterioration degree is calculated from the captured image of the target object. In the ninth variation, based on image capturing conditions for the target object according to the image capturing apparatus, the captured image obtained by the image capturing apparatus capturing the target object is estimated by using the three-dimensional model of the target object. More specifically, processing that reproduces (restores) the degradation in the image obtained by projecting the three-dimensional model onto the captured image is performed. In this processing, the three-dimensional model of the target object and the image capturing conditions according to the image capturing apparatus 108 are referred to. The processing in the present variation is similar to that of the eighth variation, except that step S1100 is different. Explanation is given below of the processing of step S1100.

In step S1100, the deterioration degree calculation unit 101 projects the three-dimensional model of the target object onto the captured image. The deterioration degree calculation unit 101 then performs processing that reproduces the degradation with respect to the image for which projection is performed. In the present variation, the deterioration degree calculation unit 101 generates an image that reproduces the bokeh and the blur.

Explanation is given below of an example of a method that reproduces the bokeh and the blur. Firstly, for each pixel of the projection image, the deterioration degree calculation unit 101 calculates the bokeh amount D and the blur amount B on the projection image based on the three-dimensional position on the three-dimensional model corresponding to the pixel. The bokeh amount D can be calculated in accordance with Equation (2), by using the three-dimensional position corresponding to the respective pixel of the projection image.

The blur amount B can be calculated as follows. Firstly, a blur amount B_(3D) in a three-dimensional space is calculated. The blur amount B_(3D) can be obtained in accordance with B_(3D)=t_(i)J_(3D)V. Here, J_(3D) is a Jacobian of a three-dimensional position corresponding to a respective pixel of the projection image; t_(i) is the exposure time; V is the relative speed of the target object with respect to the image capturing apparatus 108. The Jacobian J_(3D) is a value that expresses a rate at which a three-dimensional position corresponding to a respective pixel of the projection image changes when a position and orientation six-degrees-of-freedom parameter has slightly changed. The Jacobian J_(3D) can be calculated in accordance with Equation (18) and Equation (19).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 18} \right\rbrack & \; \\{J_{3\; D} = \left\lbrack {\frac{\partial{err}_{3\; D}}{\partial s_{1}}\mspace{14mu} \frac{\partial{err}_{3\; D}}{\partial s_{2}}\mspace{14mu} \frac{\partial{err}_{3\; D}}{\partial s_{3}}\mspace{14mu} \frac{\partial{err}_{3\; D}}{\partial s_{4}}\mspace{14mu} \frac{\partial{err}_{3\; D}}{\partial s_{5}}\mspace{14mu} \frac{\partial{err}_{3\; D}}{\partial s_{6}}} \right\rbrack} & (18) \\\left\lbrack {{EQUATION}\mspace{14mu} 19} \right\rbrack & \; \\{{err}_{3\; D} = {\left( {x^{\prime} - x} \right) + \left( {y^{\prime} - y} \right) + \left( {z^{\prime} - z} \right)}} & (19)\end{matrix}$

In Equation (18), s indicates the position and orientation of the target object. In addition, err_(3D) indicates a movement vector of the three-dimensional position during exposure, when the three-dimensional position at the start of exposure is (x, y, z) and the three-dimensional position at the end of exposure is (x′, y′, z′).

It is possible to calculate the blur amount B (a two-dimensional vector) on the projection image by projecting the blur amount B_(3D) (a three-dimensional vector) in the three-dimensional space onto the projection image.
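The relation B_(3D)=t_(i)J_(3D)V and the projection onto the image can be sketched as follows. The sketch makes several assumptions that are not stated in the text: J_(3D) is treated as a 3×6 matrix, V as a six-element vector of rates of change of the position and orientation parameters, and the projection as an ideal pinhole camera with a focal length given in pixels. The function names are hypothetical.

```python
def blur_3d(exposure_time, jacobian_3d, velocity):
    """B_3D = t_i * J_3D * V, with J_3D a 3x6 matrix and V a 6-vector (assumed shapes)."""
    return [exposure_time * sum(j * v for j, v in zip(row, velocity))
            for row in jacobian_3d]


def project(point, focal_length):
    """Ideal pinhole projection of a camera-frame point (assumed camera model)."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def blur_2d(point, b3d, focal_length):
    """Image-plane blur: difference between the projections of the 3D position
    at the start of exposure and the position displaced by B_3D."""
    start = project(point, focal_length)
    end = project([p + d for p, d in zip(point, b3d)], focal_length)
    return (end[0] - start[0], end[1] - start[1])


# Pure translation: the Jacobian of the position is [I | 0] (assumed example).
J = [[1, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0]]
V = [100.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # units per second along x
b3d = blur_3d(0.01, J, V)
print(b3d, blur_2d((0.0, 0.0, 500.0), b3d, 1000.0))
```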

Next, the deterioration degree calculation unit 101 generates, based on the obtained bokeh amount D and blur amount B, an image in which the bokeh and the blur are reproduced from the projection image. More specifically, the deterioration degree calculation unit 101 generates an image in which the bokeh and the blur are reproduced by performing, for each pixel of the projection image, a convolution calculation with a Gaussian function of a standard deviation D, as well as a convolution calculation with a rectangular wave having width B and height 1.
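The two convolutions can be sketched on a one-dimensional luminance profile as follows. This is an illustration under assumptions: a single D and B are used for the whole profile rather than per pixel, both kernels are normalized to unit sum so that overall brightness is preserved (the text specifies a rectangular wave of height 1, so this normalization is an added assumption), and border pixels are replicated. The function names are hypothetical.

```python
import math


def gaussian_kernel(D):
    """Discrete Gaussian of standard deviation D, truncated at about 3*D."""
    radius = max(1, int(math.ceil(3.0 * D)))
    weights = [math.exp(-(k * k) / (2.0 * D * D)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def box_kernel(B):
    """Rectangular kernel of width B pixels, normalized to unit sum (assumption)."""
    width = max(1, int(round(B)))
    return [1.0 / width] * width


def convolve_same(signal, kernel):
    """Filter with a kernel centered on each sample, replicating border samples.
    Both kernels are symmetric, so this equals convolution; for an even-width
    box the output is shifted by half a pixel."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


# Ideal step edge from the projection image (assumed example profile).
step = [0.0] * 10 + [1.0] * 10
degraded = convolve_same(convolve_same(step, gaussian_kernel(1.5)), box_kernel(3))
print([round(v, 3) for v in degraded])
```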

Subsequently, the deterioration degree calculation unit 101 extracts a plurality of edge features from the obtained image in which the bokeh and the blur are reproduced. Except for the use of the image in which the bokeh and the blur are reproduced instead of the captured image of the target object, this processing can be performed as in the eighth variation.

As described above, according to the method of the present variation, an image in which the bokeh and the blur are reproduced is generated based on image capturing conditions such as the relative position and orientation of the target object with respect to the image capturing apparatus 108 and the exposure time, in addition to the three-dimensional model of the target object. The deterioration degree is then calculated based on this image. By referring to this deterioration degree, it is possible to extract an image edge feature with high precision, and in addition, it is possible to cause a model edge feature to fit the image edge feature while performing weighting with respect to the edge feature. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

The method of generating the image in which the degradation is reproduced is not limited to the method described above. For example, it is possible to use a method that reproduces the bokeh and blur by using an image filter, such as is disclosed in chapter 6 of Kazuyuki Tanaka, “Introduction of Image Processing Techniques by Probabilistic Models”.

Second Embodiment

In the first embodiment, the position and orientation of the target object are estimated by causing image coordinates of a model edge feature of a three-dimensional model of a target object projected onto a captured image surface to fit image coordinates of an image edge feature extracted from the captured image. In the second embodiment, the position and orientation of the target object are estimated by causing three-dimensional coordinates of a feature (for example, a surface) of a three-dimensional model of the target object to fit three-dimensional coordinates of a feature (for example, a three-dimensional point) calculated from the captured image.

In the present embodiment, the feature extraction unit 104 calculates a three-dimensional position of a feature on an image of the target object, based on a captured image obtained by the image capturing apparatus capturing the target object. More specifically, the feature extraction unit 104 calculates the three-dimensional position of a point on the image of the target object. The three-dimensional position can be measured, for example, by irradiating an illumination pattern onto the target object. In the present embodiment, as shown in FIG. 12, an illumination pattern 1201 is projected on a target object 1205 from an irradiation apparatus 1200. An image capturing apparatus 1203 is then used to capture the target object 1205 on which the illumination pattern 1201 is projected. In the present embodiment, the illumination pattern 1201, which includes a plurality of dotted lines, is projected.

A three-dimensional position of an object surface on which the illumination pattern 1201 is projected is calculated based on the illumination pattern 1201, a captured image 1204, and the positional relationship between the irradiation apparatus 1200 and the image capturing apparatus 1203. More specifically, a position of a feature of interest in the illumination pattern 1201 projected by the irradiation apparatus 1200, a position on the captured image 1204 at which the projected feature of interest was extracted, and a relative position and orientation of the image capturing apparatus 1203 with respect to the irradiation apparatus 1200 are obtained. Here, the position of the feature of interest in the illumination pattern 1201 corresponds to a projection direction of the feature of interest from the irradiation apparatus 1200, and the position on the captured image 1204 at which the projected feature of interest is extracted corresponds to an observation direction of the feature of interest from the image capturing apparatus 1203. Accordingly, it is possible to calculate the three-dimensional position of the feature of interest in accordance with the principle of triangulation.
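The triangulation from a projection direction and an observation direction can be sketched as follows. This is one common formulation, assumed here for illustration: each direction is treated as a ray with a known origin in a shared coordinate system, and the three-dimensional position is taken as the midpoint of the shortest segment between the two rays. The function name and the example geometry are hypothetical.

```python
def triangulate(o1, d1, o2, d2):
    """Midpoint of the closest points between ray o1 + s*d1 (projector)
    and ray o2 + t*d2 (camera)."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; triangulation is undefined")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Projector at the origin, camera offset along x (assumed example geometry).
print(triangulate((0.0, 0.0, 0.0), (1.0, 0.0, 2.0),
                  (2.0, 0.0, 0.0), (-1.0, 0.0, 2.0)))
```

With noisy measurements the two rays generally do not intersect, which is why the midpoint of the closest approach is used rather than an exact intersection.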

In the present embodiment, the feature is extracted from the captured image 1204. In the present embodiment, the illumination pattern 1201 includes a plurality of measurement lines 1202, and the feature extracted from the illumination pattern 1201 that appears in the captured image 1204 is a point on a line segment. An example of a method of extracting the point on the line segment is explained below. Firstly, a luminance gradient distribution is obtained by applying a differential filter to the captured image 1204. A line segment configured by points at which the luminance gradient is an extreme value is extracted. Furthermore, the luminance gradient distribution on the line segment is obtained by applying the differential filter along the thus extracted line segment. The point at which the luminance gradient on the line segment becomes an extreme value is then extracted as the feature. Thus, it is possible to extract a feature from the captured image 1204 and obtain the position of the feature. The position of the feature in the illumination pattern 1201 may be defined in advance. Configuration may also be taken to use a similar method to extract a feature in the illumination pattern 1201 and obtain the position of the feature.

The type of the illumination pattern 1201 is not particularly limited as long as it is possible to extract a feature included in the illumination pattern 1201 from the target object 1205 on which the illumination pattern 1201 is projected. For example, there is no necessity for the illumination pattern 1201 to include a line, and the illumination pattern 1201 may include any geometric shape. As an example, FIG. 13 shows an illumination pattern 1300 that includes a plurality of points as features.

The method of calculating the three-dimensional position is not limited to the above described method. It is possible to use any method that can calculate the three-dimensional position on a face of the target object 1205 based on an image obtained by capturing the target object 1205. For example, it is possible to capture the target object 1205 by using a plurality of image capturing apparatuses arranged at different positions. In such a case, it is possible to calculate the three-dimensional position of a feature of interest through the principle of triangulation, by using the relative positions and orientations of the image capturing apparatuses and the position of the feature of interest extracted from each captured image. Furthermore, it is possible to calculate the three-dimensional position for the point on the image of the target object via the above method.

Next, explanation is given regarding processing performed in the present embodiment with reference to the flowchart of FIG. 2. The information processing apparatus 1 according to the present embodiment has a similar configuration to that of the first embodiment, and explanation is given below regarding points of difference. In the present embodiment, the model holding unit 106 holds the three-dimensional model of the target object 1205. The three-dimensional model of the target object 1205 that the model holding unit 106 holds is, for example, composed of information that indicates the positions of a group of surfaces or a group of points positioned on a face of the target object 1205. Explanation is given below of a case in which the three-dimensional model of the target object 1205 is composed of a group of surfaces.

Step S201 is performed as in the first embodiment. Step S202 and step S203 are performed as in the first embodiment, except that a deterioration degree is calculated and held for each point on a face of the three-dimensional model instead of calculating a deterioration degree for each model edge feature. The calculation of a deterioration degree for a point can be performed similarly to the calculation of a deterioration degree for a model edge feature. For example, the deterioration degree calculation unit 101 can calculate a bokeh amount by using image capturing conditions of the image capturing apparatus 1203, as well as a distance between the point and the image capturing apparatus 1203, or the like. The deterioration degree calculation unit 101 can calculate, as the blur amount, a movement amount of the point on the captured image surface during an exposure time. Configuration may be taken so that the deterioration degree calculation unit 101 and the deterioration degree holding unit 102 calculate and hold a deterioration degree for each surface of the three-dimensional model.

In step S204, the feature extraction unit 104 obtains the captured image 1204 obtained by capturing the target object 1205 on which the illumination pattern 1201 is projected, as described above. In step S205, the feature extraction unit 104 extracts a three-dimensional point as described above from the captured image obtained in step S204, and records the three-dimensional position of the three-dimensional point. Below, the extracted three-dimensional point is referred to as a measurement feature.

Below, explanation is given in detail of the processing in step S206, with reference to the flowchart of FIG. 6. The processing of step S606 and step S607 is similar to the first embodiment, and the explanation is omitted.

In step S601, the matching unit 107 performs initialization processing. Firstly, the matching unit 107 obtains the three-dimensional model of the target object 1205 from the model holding unit 106. Also, the matching unit 107 obtains a coarse position/orientation of the target object 1205, and sets it as the position and orientation s of the three-dimensional model.

In step S602, the matching unit 107 associates the three-dimensional point extracted in step S205 and the feature on the three-dimensional model obtained in step S601. In the present embodiment, for each measurement feature, the matching unit 107 detects the surface in the three-dimensional model image that is closest to the measurement feature. The matching unit 107 then associates the detected surface with the measurement feature. Below, the surface of the three-dimensional model associated with the measurement feature is referred to as a model feature.
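The association in step S602 can be sketched as follows. This is a simplification under stated assumptions: each model surface is represented as an unbounded plane (a, b, c, e) with a unit normal satisfying ax+by+cz=e, the measurement features are already expressed in the model coordinate system, and the search is exhaustive. A practical implementation would also have to respect the extent of each surface. The function name is hypothetical.

```python
def associate(points, planes):
    """For each measurement feature (x, y, z), return the index of the plane
    (a, b, c, e), with a^2 + b^2 + c^2 = 1, at the smallest distance."""
    pairs = []
    for p in points:
        def distance(plane):
            a, b, c, e = plane
            return abs(a * p[0] + b * p[1] + c * p[2] - e)
        best = min(range(len(planes)), key=lambda i: distance(planes[i]))
        pairs.append((p, best))
    return pairs


# Two model surfaces: z = 0 and x = 5 (assumed example model).
planes = [(0.0, 0.0, 1.0, 0.0), (1.0, 0.0, 0.0, 5.0)]
points = [(1.0, 1.0, 0.2), (4.9, 0.0, 3.0)]
print(associate(points, planes))
```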

In steps S603 to S605, the matching unit 107 uses a Gauss-Newton method to calculate the position and orientation of the target object, as in the first embodiment. Specifically, the matching unit 107 repeatedly updates the relative position and orientation of the three-dimensional model with respect to the viewpoint so that the degree of matching between the image of the target object and the image of the three-dimensional model becomes larger. In the present embodiment too, the degree of matching is obtained based on a difference with a corresponding model feature for each measurement feature, which is weighted in accordance with a deterioration degree corresponding to the position at which the measurement feature was extracted. The difference between a measurement feature and a model feature is the distance between the three-dimensional position of the measurement feature and the three-dimensional position of the model feature. The matching unit 107, as in the first embodiment, calculates a degree of matching with the image of the target object for a plurality of three-dimensional model images obtained by changing the position and orientation. From the plurality of three-dimensional model images, the three-dimensional model image that provides the highest degree of matching may be determined to be the three-dimensional model image corresponding to the image of the target object.

In step S603, the matching unit 107 calculates a coefficient matrix and an error vector, as follows. Each element of the coefficient matrix is a first-order partial differential coefficient corresponding to a slight change of the position and orientation of the three-dimensional model for the coordinates of the measurement feature, and more specifically is a partial differential coefficient of three-dimensional coordinates. The error vector is a distance in a three-dimensional space between a measurement feature and a model feature.

Three-dimensional coordinates of a group of points in a camera coordinate system, e.g. a coordinate system based on a position and an optical axis direction of the image capturing apparatus 1203, are converted into three-dimensional coordinates (x, y, z) in a coordinate system of the target object 1205 by using the position and orientation s of the target object 1205. Here, the three-dimensional coordinates of a measurement feature in the camera coordinate system are converted into three-dimensional coordinates (x₀, y₀, z₀) in the target object coordinate system, according to the coarse position/orientation of the target object 1205. (x, y, z) changes with the position and orientation s of the target object 1205, and can be approximated as in Equation (20) by a first-order Taylor expansion in the neighborhood of (x₀, y₀, z₀).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 20} \right\rbrack & \; \\{{x \approx {x_{0} + {\sum\limits_{i = 1}^{6}\; {\frac{\partial x}{\partial s_{i}}\Delta \; s_{i}}}}}{y \approx {y_{0} + {\sum\limits_{i = 1}^{6}\; {\frac{\partial y}{\partial s_{i}}\Delta \; s_{i}}}}}{z \approx {z_{0} + {\sum\limits_{i = 1}^{6}\; {\frac{\partial z}{\partial s_{i}}\Delta \; s_{i}}}}}} & (20)\end{matrix}$

An equation in a model coordinate system of a model feature associated with a measurement feature is configured to be ax+by+cz=e (a²+b²+c²=1; a, b, c, and e are constants). If the position and orientation s of the target object 1205 is accurate, when the three-dimensional coordinates (x, y, z) of a measurement feature in the camera coordinate system are converted to three-dimensional coordinates in the target object coordinate system, it can be considered that the three-dimensional coordinates after conversion satisfy the above described equation. If this is assumed, Equation (21) is obtained by substituting Equation (20) into the equation of the surface, ax+by+cz=e (a²+b²+c²=1).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 21} \right\rbrack & \; \\{{{{a{\sum\limits_{i = 1}^{6}\; {\frac{\partial x}{\partial s_{i}}\Delta \; s_{i}}}} + {b{\sum\limits_{i = 1}^{6}\; {\frac{\partial y}{\partial s_{i}}\Delta \; s_{i}}}} + {c{\sum\limits_{i = 1}^{6}\; {\frac{\partial z}{\partial s_{i}}\Delta \; s_{i}}}}} = {e - q}}{q = {{ax}_{0} + {by}_{0} + {cz}_{0}}}} & (21)\end{matrix}$

In Equation (21), q is a constant.
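The substitution that leads from Equation (20) to Equation (21) can be written out step by step as follows, using only the symbols already defined above: the Taylor expansions of x, y, and z are inserted into the equation of the surface, and the terms that do not depend on Δs_(i) are moved to the right side.

$\begin{aligned} a\left(x_{0} + \sum_{i=1}^{6}\frac{\partial x}{\partial s_{i}}\Delta s_{i}\right) + b\left(y_{0} + \sum_{i=1}^{6}\frac{\partial y}{\partial s_{i}}\Delta s_{i}\right) + c\left(z_{0} + \sum_{i=1}^{6}\frac{\partial z}{\partial s_{i}}\Delta s_{i}\right) &= e \\ a\sum_{i=1}^{6}\frac{\partial x}{\partial s_{i}}\Delta s_{i} + b\sum_{i=1}^{6}\frac{\partial y}{\partial s_{i}}\Delta s_{i} + c\sum_{i=1}^{6}\frac{\partial z}{\partial s_{i}}\Delta s_{i} &= e - \left(ax_{0} + by_{0} + cz_{0}\right) = e - q \end{aligned}$

Because the normal (a, b, c) has unit length, e − q is the signed distance from the converted measurement feature (x₀, y₀, z₀) to the surface.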

Equation (21) holds true for all measurement features for which a corresponding model feature is detected in step S602. Accordingly, a system of simultaneous linear equations in Δs_(i), as in Equation (22), holds true.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 22} \right\rbrack & \; \\{\begin{bmatrix}{{a_{1}\frac{\partial x_{1}}{\partial s_{1}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{1}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{1}}}} & {{a_{1}\frac{\partial x_{1}}{\partial s_{2}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{2}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{2}}}} & \ldots & {{a_{1}\frac{\partial x_{1}}{\partial s_{6}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{6}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{6}}}} \\{{a_{2}\frac{\partial x_{2}}{\partial s_{1}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{1}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{1}}}} & {{a_{2}\frac{\partial x_{2}}{\partial s_{2}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{2}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{2}}}} & \ldots & {{a_{2}\frac{\partial x_{2}}{\partial s_{6}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{6}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{6}}}} \\\vdots & \vdots & \ddots & \vdots\end{bmatrix}{\quad{\begin{bmatrix}{\Delta \; s_{1}} \\{\Delta \; s_{2}} \\{\Delta \; s_{3}} \\{\Delta \; s_{4}} \\{\Delta \; s_{5}} \\{\Delta \; s_{6}}\end{bmatrix} = \begin{bmatrix}{e_{1} - q_{1}} \\{e_{2} - q_{2}} \\\vdots\end{bmatrix}}}} & (22)\end{matrix}$

In Equation (22), the matrix on the left side is a coefficient matrix, and the matrix on the right side is an error vector. In this way, the matching unit 107 calculates the coefficient matrix and the error vector.
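How one row of the coefficient matrix and one element of the error vector in Equation (22) are formed can be sketched as follows. The sketch assumes that the 3×6 matrix of partial derivatives ∂(x, y, z)/∂s_(i) for the measurement feature is already available, since its form depends on how the position and orientation s is parameterized; the function name and the example values are hypothetical.

```python
def plane_row_and_error(plane, point0, jacobian_xyz):
    """Return one row of the coefficient matrix and one error element.
    plane: (a, b, c, e) with a^2 + b^2 + c^2 = 1.
    point0: (x0, y0, z0), the measurement feature in the object frame.
    jacobian_xyz: 3x6 partial derivatives of (x, y, z) with respect to s_1..s_6."""
    a, b, c, e = plane
    x0, y0, z0 = point0
    normal = (a, b, c)
    # Element i is a*dx/ds_i + b*dy/ds_i + c*dz/ds_i.
    row = [sum(n * jacobian_xyz[k][i] for k, n in enumerate(normal))
           for i in range(6)]
    q = a * x0 + b * y0 + c * z0
    return row, e - q


# Surface z = 2 and a Jacobian that only models translation (assumed example).
J = [[1, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0]]
row, err = plane_row_and_error((0.0, 0.0, 1.0, 2.0), (1.0, 1.0, 1.5), J)
print(row, err)
```

Stacking the rows and error elements for all associated measurement features gives the coefficient matrix and error vector of Equation (22).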

In step S604, the reliability calculation unit 105 calculates thereliability for each measurement feature. Calculation of the reliabilitycan be performed similarly to in the first embodiment. For example, thereliability calculation unit 105 obtains a corresponding deteriorationdegree for each measurement feature from the deterioration degreeholding unit 102. In the present embodiment, from respective points on aface of the three-dimensional model for which a deterioration degree arecalculated, it is possible to obtain the deterioration degree for apoint that is closest to the three-dimensional position of a measurementfeature as the deterioration degree for the measurement feature. Thereliability calculation unit 105 calculates an average of thedeterioration degrees for the each of the measurement features, andfurther calculates the reliabilities for the respective measurementfeatures by using Equation (1). In the present embodiment, c of Equation(1) is the average of the deterioration degrees for the measurementfeatures, and b is the deterioration degree for a measurement feature.The method of calculating the reliability is not limited to this method,and it is possible to use various methods as explained in the firstembodiment. In addition, configuration may be taken to divide athree-dimensional space into a plurality of regions, calculate astatistic of the deterioration degree for the measurement features foreach segmented area, and express the reliability by a function that hasthe calculated statistic as a parameter.

The calculated reliability is used as the weight for each measurement feature. In the following explanation, as explained in the first embodiment, the weight matrix W is defined based on reliability, as in Equation (12). The weight matrix W is a square matrix for which components other than diagonal elements are 0. A diagonal element is the weight for each model feature, i.e. the reliability w_(i) (i=1 to N_(c)). N_(c) is a total number of model features associated with the measurement feature. As in the first embodiment, Equation (11) is transformed by using the weight matrix W as in Equation (13). In step S605, the reliability calculation unit 105 obtains the correction value Δs by solving Equation (13) as in Equation (14). The correction value Δs of the position and orientation for which the degree of matching between the image of the three-dimensional model and the image of the target object becomes higher is obtained by the processing of step S605.
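A minimal sketch of the weighted solution of step S605 is shown below. Equations (13) and (14) are given in the first embodiment; the sketch assumes that they amount to the weighted normal equations (JᵀWJ)Δs = JᵀWE, where J is the coefficient matrix, E is the error vector, and W is the diagonal weight matrix. The function name solve_correction and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def solve_correction(J, E, w):
    """Solve (J^T W J) ds = J^T W E for the 6-element correction value ds.

    J: (N, 6) coefficient matrix, E: (N,) error vector,
    w: (N,) reliabilities forming the diagonal of the weight matrix W.
    """
    J = np.asarray(J, dtype=float)
    E = np.asarray(E, dtype=float)
    W = np.diag(np.asarray(w, dtype=float))
    A = J.T @ W @ J          # 6 x 6 weighted normal matrix
    b = J.T @ W @ E          # 6-element right-hand side
    return np.linalg.solve(A, b)
```

A row whose reliability is 0 has no influence on the result, which is how a strongly degraded measurement feature is prevented from disturbing the estimate.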

As explained above, the position and orientation of the target object is estimated by the processing of steps S601-S607.

In the present embodiment, explanation is given for a case in which the three-dimensional model of the target object 1205 is composed of a group of surfaces. However, the three-dimensional model of the target object 1205 may be configured by a group of points. In such a case, in step S602 it is possible to associate, with each measurement feature, the point of the three-dimensional model that is closest to it. It is then possible to obtain the correction value Δs of the position and orientation of the target object 1205 so that the evaluation function obtained by weighting the distance between corresponding points with the reliability is minimized.
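For the point-group case, the association of step S602 and the weighted evaluation function can be sketched as follows. This is an illustrative sketch only: the function names associate_nearest and weighted_cost are hypothetical, a brute-force nearest-point search is used for brevity, and the squared Euclidean distance is assumed as the distance between corresponding points.

```python
import numpy as np


def associate_nearest(measured, model):
    """For each measurement feature, return the index of the closest model point."""
    measured = np.asarray(measured, dtype=float)   # shape (M, 3)
    model = np.asarray(model, dtype=float)         # shape (K, 3)
    # Pairwise squared distances, shape (M, K).
    d2 = ((measured[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def weighted_cost(measured, model, idx, w):
    """Sum of squared point-to-point distances weighted by the reliabilities w."""
    diff = np.asarray(measured, dtype=float) - np.asarray(model, dtype=float)[idx]
    return float((np.asarray(w, dtype=float) * (diff ** 2).sum(axis=1)).sum())
```

The correction value Δs would then be chosen to reduce the value returned by weighted_cost, for example by repeating association and correction until convergence.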

By the second embodiment explained above, the position and orientation of the target object is estimated by causing a model feature to fit a measurement feature while performing weighting with respect to the measurement feature considering degradation of the captured image in accordance with the bokeh, the blur, and the like. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

Third Embodiment

In the first embodiment, the position and orientation of the target object are estimated by causing image coordinates of a model edge feature of a three-dimensional model of a target object projected onto a captured image surface to fit image coordinates of an image edge feature extracted from the captured image. In the second embodiment, the position and orientation of the target object are estimated by causing three-dimensional coordinates of a model feature (for example, a surface) of a three-dimensional model of the target object to fit three-dimensional coordinates of a measurement feature (for example, a three-dimensional point) calculated from the captured image. In the third embodiment, by combining the first and second embodiments, the position and orientation of the target object are estimated by using both fitting of image coordinates on the captured image and fitting of three-dimensional coordinates in a three-dimensional space.

Next, explanation is given regarding processing performed in the present embodiment with reference to the flowchart of FIG. 2. The information processing apparatus 1 according to the present embodiment has a similar configuration to that of the first embodiment, and explanation is given below regarding points of difference.

Step S201 is performed as in the first embodiment. In step S202 and step S203, a deterioration degree for each model edge feature is calculated and held as in the first embodiment, and in addition, a deterioration degree for each surface of the three-dimensional model is calculated and held, as in the second embodiment. Step S204 is performed as in the second embodiment. In step S205, the feature extraction unit 104 extracts an image edge feature from the captured image, as in the first embodiment, and in addition extracts a measurement feature (a three-dimensional point), as in the second embodiment.

Below, explanation is given in detail of the processing in step S206, with reference to the flowchart of FIG. 6. The processing of step S606 and step S607 is similar to that of the first embodiment, and the explanation is omitted. In step S601, the matching unit 107 performs initialization processing as in the first and second embodiments.

In step S602, the matching unit 107 performs association of each group of model edge features and each group of image edge features, as in the first embodiment. In addition, the matching unit 107 associates a measurement feature and a model feature (surface of the three-dimensional model), as in the second embodiment.

In step S603, the matching unit 107 calculates an error vector and a coefficient matrix in order to solve a linear simultaneous equation. More specifically, the matching unit 107 performs both the processing explained in the first embodiment and the processing explained in the second embodiment. Equation (23) is obtained by combining the error vector and the coefficient matrix regarding an edge feature obtained in accordance with the first embodiment, and the error vector and the coefficient matrix regarding a measurement feature obtained in accordance with the second embodiment.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 23} \right\rbrack & \; \\{\begin{bmatrix}{{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{1}}} - {\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{1}}}} & {{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{2}}} - {\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{2}}}} & \ldots & {{\sin \; \theta_{1}\frac{\partial u_{1}}{\partial s_{6}}} - {\cos \; \theta_{1}\frac{\partial v_{1}}{\partial s_{6}}}} \\{{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{1}}} - {\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{1}}}} & {{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{2}}} - {\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{2}}}} & \ldots & {{\sin \; \theta_{2}\frac{\partial u_{2}}{\partial s_{6}}} - {\cos \; \theta_{2}\frac{\partial v_{2}}{\partial s_{6}}}} \\\vdots & \vdots & \ddots & \vdots \\{{a_{1}\frac{\partial x_{1}}{\partial s_{1}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{1}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{1}}}} & {{a_{1}\frac{\partial x_{1}}{\partial s_{2}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{2}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{2}}}} & \ldots & {{a_{1}\frac{\partial x_{1}}{\partial s_{6}}} + {b_{1}\frac{\partial y_{1}}{\partial s_{6}}} + {c_{1}\frac{\partial z_{1}}{\partial s_{6}}}} \\{{a_{2}\frac{\partial x_{2}}{\partial s_{1}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{1}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{1}}}} & {{a_{2}\frac{\partial x_{2}}{\partial s_{2}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{2}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{2}}}} & \ldots & {{a_{2}\frac{\partial x_{2}}{\partial s_{6}}} + {b_{2}\frac{\partial y_{2}}{\partial s_{6}}} + {c_{2}\frac{\partial z_{2}}{\partial s_{6}}}} \\\vdots & \vdots & \vdots & \vdots\end{bmatrix}{\quad{\begin{bmatrix}{\Delta \; s_{1}} \\{\Delta \; s_{2}} \\{\Delta \; s_{3}} \\{\Delta \; s_{4}} \\{\Delta \; s_{5}} \\{\Delta 
\; s_{6}}\end{bmatrix} = \begin{bmatrix}{d_{1} - r_{1}} \\{d_{2} - r_{2}} \\\vdots \\{e_{1} - q_{1}} \\{e_{2} - q_{2}} \\\vdots\end{bmatrix}}}} & (23)\end{matrix}$

In step S604, the matching unit 107 calculates a reliability for an image edge feature corresponding to a model edge feature, as in the first embodiment, and also calculates a reliability for a measurement feature corresponding to a model feature, as in the second embodiment. The calculated reliabilities are used as a weight for each image edge feature and measurement feature. In the following explanation, the weight matrix W is defined based on reliability, as in Equation (24).

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 24} \right\rbrack & \; \\{W = \begin{bmatrix}w_{{2\; d},1} & \; & \; & \; & \; & 0 \\\; & \ddots & \; & \; & \; & \; \\\; & \; & w_{{2\; d},{Na}} & \; & \; & \; \\\; & \; & \; & w_{{3\; d},1} & \; & \; \\\; & \; & \; & \; & \ddots & \; \\{0\;} & \; & \; & \; & \; & w_{{3\; d},1}\end{bmatrix}} & (24)\end{matrix}$

The weight matrix W is a square matrix for which components other than diagonal elements are 0. The diagonal elements of the weight matrix W are a reliability w_(2d,i) (i=1 to Na) for the image edge features, and a reliability w_(3d,i) (i=1 to Nb) for the measurement features. Na is the total number of model edge features for which corresponding image edge features are detected, and Nb is the total number of model features corresponding to measurement features.

As in the first embodiment, Equation (11) is transformed by using the weight matrix W as in Equation (13). In step S605, the reliability calculation unit 105 obtains the correction value Δs by solving Equation (13) as in Equation (14).
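The combination of Equation (23) with the weight matrix of Equation (24) can be sketched as follows. This is a sketch under stated assumptions: the edge-feature rows and the measurement-feature rows are assumed to have been computed already, the weighted solution is assumed to take the normal-equation form (JᵀWJ)Δs = JᵀWE, and the function name combine_and_solve is a hypothetical name introduced here.

```python
import numpy as np


def combine_and_solve(J2d, E2d, w2d, J3d, E3d, w3d):
    """Stack the 2D edge rows and 3D measurement rows, then solve for ds.

    J2d: (Na, 6), E2d: (Na,), w2d: (Na,) for image edge features.
    J3d: (Nb, 6), E3d: (Nb,), w3d: (Nb,) for measurement features.
    """
    J = np.vstack([J2d, J3d])            # coefficient matrix of Equation (23)
    E = np.concatenate([E2d, E3d])       # error vector of Equation (23)
    w = np.concatenate([w2d, w3d])       # diagonal of W in Equation (24)
    JW = J.T * w                         # same as J.T @ np.diag(w)
    return np.linalg.solve(JW @ J, JW @ E)
```

Multiplying the columns of J.T by w avoids building the full (Na+Nb) by (Na+Nb) diagonal matrix, which is a design choice of this sketch rather than something stated in the embodiment.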

As explained above, the position and orientation of the target object are estimated by the processing of steps S601-S607.

By the third embodiment explained above, the position and orientation of the target object are estimated by using both a measurement feature and an image edge feature while performing weighting with respect to the measurement feature and the image edge feature considering degradation of the captured image in accordance with the bokeh, the blur, and the like. Accordingly, it is possible to improve the precision of estimation of the position and orientation of the target object.

Fourth Embodiment

In the fourth embodiment, an example of application of the information processing apparatus 1 according to the first to third embodiments is explained with reference to FIG. 14. More specifically, the information processing apparatus 1 estimates the position and orientation of the target object 1205 based on the captured image obtained by an image capturing apparatus 1400. An industrial robot 1401 operates the target object 1205 based on the estimated position and orientation of the target object 1205. As shown in FIG. 14, the robot system according to the present embodiment comprises the information processing apparatus 1, the image capturing apparatus 1400, and the robot 1401.

The robot 1401 is for example an industrial robot, and comprises a robot arm that has a movable shaft. Movement of the robot arm is controlled by a robot controller, and it is possible to cause an end effector to move to an instructed position. Thus the robot 1401 can perform an operation on an object, for example gripping the object or the like. The position of the target object 1205 which is placed on a work table can change. Accordingly, for the robot 1401 to perform operation on the target object 1205, it is necessary to estimate the current position and orientation of the target object 1205, and control movement of the robot arm based on this estimation. The robot controller may be provided by the robot 1401, or the robot 1401 may be connected to the robot controller.

The image capturing apparatus 1400 is a camera that captures two-dimensional images. As the image capturing apparatus 1400, it is possible to use an ordinary camera. The image capturing apparatus 1400 is placed at a position at which it is possible to capture the target object 1205. In an embodiment, the image capturing apparatus 1400 is provided on the end effector of the robot arm of the robot 1401, i.e. on a hand that grips the object, or adjacent to the hand. The image capturing apparatus 1400 may instead be arranged at a position separated from the robot 1401.

As explained in the first to third embodiments, the information processing apparatus 1 estimates the position and orientation of the target object 1205 based on the captured image obtained from the image capturing apparatus 1400. As necessary, the irradiation apparatus explained in the second embodiment may also be used. The position and orientation of the target object 1205 estimated by the information processing apparatus 1 is transmitted to the robot controller. The robot controller controls the position and orientation of the robot arm based on the obtained position and orientation of the target object 1205. Thus, the robot 1401 can perform operations, such as gripping or the like of the target object 1205.

As described above, the robot system according to the fourth embodiment can perform an operation with respect to the target object 1205 by estimating the position and orientation of the target object 1205, even if the position of the target object 1205 is not fixed.

Fifth Embodiment

In the first to fourth embodiments, the position and orientation of the target object is estimated. In the fifth embodiment, specification of a type of the target object is performed. The information processing apparatus 1 according to the fifth embodiment has a similar configuration to that of the first embodiment; explanation is given below of points of difference.

The deterioration degree calculation unit 101 and the deterioration degree holding unit 102 calculate and hold the deterioration degree. In the present embodiment, for example as explained in the first, eighth and ninth variations, the deterioration degree is calculated and held for each pixel. In the present embodiment, the deterioration degree is also calculated and held for each later-described projection image.

The feature extraction unit 104 extracts features from the captured image obtained by the image obtaining unit 103. In the present embodiment, the luminance value for each pixel is extracted as a feature.

The model holding unit 106 holds the model information for a plurality of comparison targets, which are objects that are compared to the target object. One piece of the model information is composed of an image that includes an image of the comparison target, and an identifier that identifies the type of the comparison target. In other words, the model holding unit 106 performs image holding, and holds a plurality of images that include images of comparison targets. In the present embodiment, an image of a comparison target is a projection image obtained by projecting the three-dimensional model of the comparison target onto the captured image. The image of the comparison target may be a captured image obtained by capturing the comparison target. Also, the model holding unit 106 can hold a plurality of pieces of model information for one comparison target for cases where the relative position and orientation of the comparison target with respect to the image capturing apparatus differs.

The reliability calculation unit 105 calculates the reliability for each pixel of the projection image that the model holding unit 106 holds. In the present embodiment, the reliability calculation unit 105 calculates a statistic value, for example an average value, of the deterioration degree, from the deterioration degree for each pixel of the projection image. As in the first embodiment, the reliability calculation unit 105 then uses a Tukey function to calculate the reliability of each pixel. As explained previously, a method of calculating the reliability is not particularly limited as long as the reliability becomes lower as the deterioration degree becomes higher. For example, configuration may be taken to calculate the reliability of each pixel by using the Tukey function and a predetermined threshold. Configuration may be taken to use the calculated statistic value to set a common reliability with respect to all pixel positions. As the statistic value, it is also possible to use a median value, a standard deviation, or the like.

The matching unit 107 specifies the type of the target object included in the captured image obtained by the image obtaining unit 103. More specifically, the matching unit 107 determines a similarity of the captured image obtained by the image obtaining unit 103 and the projection image that the model holding unit 106 holds. In the present embodiment, the matching unit 107 weights, in accordance with the reliability for each pixel position, the difference between the pixel value of the projection image and the pixel value of the captured image for each pixel position. The degree of matching between the image of the target object and the image of the comparison target is then calculated based on the weighted difference obtained for each pixel position. The thus obtained degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to the position at which the feature is extracted.

In one example, the matching unit 107 uses a normalized cross-correlation function shown in Equation (25) to calculate a similarity NCC between the luminance value obtained by the feature extraction unit 104 and the luminance value of the projection image that the model holding unit 106 holds. The closer the value of NCC calculated in accordance with Equation (25) is to 1, the higher the similarity is determined to be.

$\begin{matrix}\left\lbrack {{EQUATION}\mspace{14mu} 25} \right\rbrack & \; \\{{NCC} = \frac{\sum\limits_{j = 0}^{N - 1}\; {\sum\limits_{i = 0}^{M - 1}\; {\left( {{W\left( {i,j} \right)}{I\left( {i,j} \right)}} \right)\mspace{14mu} {T\left( {i,j} \right)}}}}{\sqrt{\sum\limits_{j = 0}^{N - 1}\; {\sum\limits_{i = 0}^{M - 1}\; {\left( {{W\left( {i,j} \right)}{I\left( {i,j} \right)}} \right)^{2} \times {\sum\limits_{j = 0}^{N - 1}\; {\sum\limits_{i = 0}^{M - 1}\; {T\left( {i,j} \right)}^{2}}}}}}}} & (25)\end{matrix}$

In Equation (25), i and j indicate a pixel position, and functions I and T respectively indicate the luminance value of the projection image that the model holding unit 106 holds and the luminance value obtained by the feature extraction unit 104. In addition, a function W indicates the reliability calculated by the reliability calculation unit 105.

The matching unit 107 calculates the NCC with respect to each projection image that the model holding unit 106 holds. The matching unit 107 specifies the projection image for which the value of NCC is closest to 1. Finally, the matching unit 107 specifies the type of the target object by using the identifier associated with the specified projection image. The matching unit 107 can output information that indicates the type of the specified target object. If the value of NCC does not exceed a threshold, the matching unit 107 can determine that an object of a registered type is not present in the captured image.
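The type specification based on Equation (25) can be sketched as follows. The sketch is illustrative only: the function names weighted_ncc and specify_type, the representation of the model information as (identifier, projection image, reliability) tuples, and the default threshold of 0.8 are assumptions that do not appear in the embodiment.

```python
import numpy as np


def weighted_ncc(projection, captured, reliability):
    """Equation (25): I is the projection image, T the captured image, W the reliability."""
    I = np.asarray(projection, dtype=float)
    T = np.asarray(captured, dtype=float)
    W = np.asarray(reliability, dtype=float)
    WI = W * I
    denom = np.sqrt((WI ** 2).sum() * (T ** 2).sum())
    if denom == 0.0:
        return 0.0  # no usable pixels; treated here as no similarity
    return float((WI * T).sum() / denom)


def specify_type(captured, models, threshold=0.8):
    """Return the identifier of the best-matching projection image, or None.

    models: iterable of (identifier, projection_image, reliability_map).
    Because NCC cannot exceed 1, the largest NCC is the one closest to 1.
    """
    best_id, best_ncc = None, float("-inf")
    for identifier, projection, reliability in models:
        ncc = weighted_ncc(projection, captured, reliability)
        if ncc > best_ncc:
            best_id, best_ncc = identifier, ncc
    return best_id if best_ncc > threshold else None
```

Returning None corresponds to the determination that an object of a registered type is not present in the captured image.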

Next, explanation is given of processing in the present embodiment, with reference to the flowchart of FIG. 2. The processing of step S201 is performed as in the first embodiment. In steps S202 and S203, the deterioration degree calculation unit 101 and the deterioration degree holding unit 102 calculate and hold the deterioration degree as described above. The processing of step S204 is performed as in the first embodiment.

In step S205, the feature extraction unit 104 extracts as a feature each pixel value on the captured image obtained in step S204. In step S206, as described above, the matching unit 107 uses the features extracted in step S205, the projection image that the model holding unit 106 holds, and reliabilities calculated by the reliability calculation unit 105 to perform recognition of the target object.

By the above processing, it is possible to specify the type of the target object. By applying the present embodiment, it is further possible to determine the position and orientation of the target object. For example, one piece of model information that the model holding unit 106 holds may be configured by a projection image, an identifier, and information indicating the relative position and orientation of the image capturing apparatus with respect to the comparison target. In such a case, it is possible for the matching unit 107 to use the information that indicates the relative position and orientation associated with the specified projection image to determine the position and orientation of the target object.

According to the fifth embodiment, it is possible to perform specification of the type of the target object at high precision by calculating the degree of matching between the captured image and the projection image, while performing weighting with respect to each pixel by considering degradation of the captured image in accordance with the bokeh and the blur, or the like.

[Tenth Variation]

In the fifth embodiment, recognition of the target object is performed based on the similarity between the captured image of the target object and a projection image that the model holding unit 106 holds. In the tenth variation, recognition of the target object is performed by using a similarity of SIFT features. The information processing apparatus 1 according to the tenth variation has a similar configuration to that of the first embodiment; explanation is given below of points of difference.

The feature extraction unit 104 extracts SIFT feature amounts from the captured image obtained by the image obtaining unit 103. For example, the feature extraction unit 104 can detect feature points (key points) from the captured image, and calculate a SIFT feature amount at each key point. A method of detecting key points and a method of calculating SIFT feature amounts are publicly known. As one example in the present embodiment, the method disclosed in Hironobu Fujiyoshi, “Local Feature Amounts for Generic Object Recognition (SIFT and HOG)” Information Processing Institute Research Report, CVIM 160, pp. 211-224, 2007 is used.

The model holding unit 106 holds the model information for the plurality of comparison targets. One piece of the model information is composed of SIFT feature amounts extracted from the projection image obtained by projecting the three-dimensional model of the comparison target on the captured image, and an identifier that specifies the type of the comparison target. In the present embodiment, one piece of model information includes a plurality of SIFT feature amounts. Also, the model holding unit 106 can hold a plurality of pieces of model information for one comparison target for cases where the relative position and orientation of the comparison target with respect to the image capturing apparatus differs.

The matching unit 107 specifies a type of a target object included in the captured image obtained by the image obtaining unit 103. More specifically, the matching unit 107 first selects one piece of the model information that the model holding unit 106 holds, and then associates a SIFT feature amount included in the selected model information with a SIFT feature amount extracted by the feature extraction unit 104. A method of associating the SIFT feature amounts is publicly known, and as one example in the present embodiment, the method disclosed in Hironobu Fujiyoshi, “Local Feature Amounts for Generic Object Recognition (SIFT and HOG)” Information Processing Institute Research Report, CVIM 160, pp. 211-224, 2007 is used.

When performing association of SIFT feature amounts, it is possible to associate a SIFT feature amount that the model holding unit 106 holds with a SIFT feature amount extracted from the captured image obtained by the image obtaining unit 103 when the distance between the two feature amounts is small. This associating can be performed independently of a pixel position of a key point. In the first to fifth embodiments, features whose positions are close are associated, but, as in this case, it is possible to use various methods as the method of associating the features.

Next, the matching unit 107 obtains a reliability for associated SIFT feature amounts. In the present variation, as in the fifth embodiment, the reliability calculation unit 105 calculates the reliability for each pixel of the projection image that the model holding unit 106 holds. The matching unit 107 obtains, from the reliability calculation unit 105, the reliability for the pixel of the key point corresponding to a SIFT feature amount extracted from the captured image. The matching unit 107 calculates, as a similarity R, the sum total of weighted distances obtained by multiplying the Euclidean distance between each pair of associated SIFT features by the obtained reliability as a weight.

The matching unit 107 performs the above described processing to calculate the similarity R for each piece of model information. The matching unit 107 then specifies the model information for which the similarity R becomes a minimum, and specifies the type of the target object in accordance with an identifier associated with the specified model information. The matching unit 107 can output information that indicates the type of the specified target object. In the present variation, as in the fifth embodiment, it is possible to use the information that indicates the relative position and orientation associated with the specified model information to determine the position and orientation of the target object.
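The calculation of the similarity R and the selection of the model information can be sketched as follows. This is a minimal sketch under stated assumptions: the association of SIFT feature amounts is assumed to have been performed already, each association is represented as a hypothetical (model descriptor, image descriptor, reliability) tuple, and the function names similarity_r and specify_type_by_sift are introduced only for this example.

```python
import math


def similarity_r(pairs):
    """Sum of reliability-weighted Euclidean distances between associated descriptors.

    pairs: iterable of (model_descriptor, image_descriptor, reliability).
    """
    return sum(w * math.dist(m, f) for m, f, w in pairs)


def specify_type_by_sift(candidates):
    """Return the identifier whose model information gives the minimum R.

    candidates: dict mapping an identifier to its list of associated pairs.
    """
    return min(candidates, key=lambda identifier: similarity_r(candidates[identifier]))
```

Since R is a sum of distances, a smaller value indicates a better match, which is why the minimum is selected here rather than the maximum as with NCC.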

SIFT feature amounts are used in the present variation, but the type of the target object can be specified using other features extracted from the image. For example, it is possible to use an edge feature, a three-dimensional point calculated based on the captured image, or the like, as the feature.

Sixth Embodiment

In the above described embodiments, each processing unit shown in, for example, FIG. 1 or the like is realized by dedicated hardware. However, a portion or all of the processing units may be realized by a computer. In the present embodiment, at least a portion of the processing according to each of the above described embodiments is executed by a computer. FIG. 15 is a view that shows a basic configuration of a computer. To execute the above described functions of each embodiment in the computer, each functional configuration may be expressed by a program and read by the computer. Thus, it is possible to realize each function of the above described embodiments by the computer. In such a case, each component in FIG. 15 may be caused to function by a function or a subroutine that the CPU executes.

In addition, the computer program is ordinarily stored in a non-transitory computer-readable storage medium, such as a CD-ROM. The computer program can be executed by setting the storage medium into a reading apparatus (such as a CD-ROM drive) that the computer has, and copying or installing it into the system. Accordingly, it is clear that the corresponding non-transitory computer-readable storage medium is within the scope of the present invention.

FIG. 15 is a view that shows a basic configuration of a computer. A processor 1510 in FIG. 15 is for example a CPU, and controls operation of the computer on the whole. A memory 1520 is for example a RAM, and temporarily stores a program, data, or the like. A non-transitory computer-readable storage medium 1530 is for example a hard disk or a CD-ROM, and stores a program and data, or the like for long periods. In the present embodiment, a program, which realizes the function of each unit and is stored in the storage medium 1530, is read into the memory 1520. The processor 1510 realizes the function of each unit by operating in accordance with the program in the memory 1520.

In FIG. 15, an input interface 1540 is an interface for obtaining information from an external apparatus, and is, for example, connected to an operation panel 112 or the like. Also, an output interface 1550 is an interface for outputting information to an external apparatus, and is, for example, connected to an LCD monitor 113, or the like. A bus 1560 connects each above described unit, and enables exchange of data.

OTHER EMBODIMENTS

Embodiment(s) of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.

While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

This application claims the benefit of Japanese Patent Application No. 2014-242301, filed Nov. 28, 2014, which is hereby incorporated by reference herein in its entirety.

What is claimed is:
 1. An image processing apparatus comprising: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; an extraction unit configured to extract a feature of the target object from the captured image based on the deterioration degree; a model holding unit configured to hold a three-dimensional model of the target object; an associating unit configured to associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and a deriving unit configured to derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association.
 2. The image processing apparatus according to claim 1, wherein the deterioration degree obtaining unit is further configured to obtain the deterioration degree from a deterioration degree holding unit configured to hold in advance information that indicates an image deterioration degree for each position of a captured image that is captured by an image capturing apparatus.
 3. The image processing apparatus according to claim 2, wherein, the deterioration degree holding unit is further configured to hold the information that indicates the deterioration degree in association with a position and orientation between the target object and the image capturing apparatus.
 4. The image processing apparatus according to claim 2, wherein the deterioration degree holding unit is further configured to hold information indicating a deterioration degree, of a captured image obtained by an image capturing apparatus by capturing the target object, at a position of each feature of an image of the target object in association with a feature that the three-dimensional model comprises that corresponds to the respective feature.
 5. The image processing apparatus according to claim 1, wherein the deriving unit is further configured to correct the predetermined position and orientation to make a distance between the feature of the three-dimensional model and the feature of the target object that are associated by the associating unit small.
 6. The image processing apparatus according to claim 1, wherein the feature is an edge feature, the extraction unit is further configured to extract a plurality of edge features by performing edge detection processing on the captured image, and the associating unit is further configured to, for each of a plurality of edges that the three-dimensional model has, calculate an image position obtained by projecting the edge on a projection image based on the predetermined position and orientation, and associate an image position of an edge feature of an image of the target object in the captured image with an image position of an edge of the three-dimensional model on the projection image which neighbors the edge feature of the image of the target object.
 7. The image processing apparatus according to claim 1, wherein the captured image is obtained by the image capturing apparatus capturing the target object, on which an illumination pattern is irradiated by an irradiation apparatus, the extraction unit is further configured to extract as the feature a three-dimensional position of a point on an image of the target object, based on a position of the irradiation apparatus, a position of the image capturing apparatus, and the illumination pattern, and the associating unit is further configured to associate the three-dimensional position of the point on the image of the target object and a three-dimensional position of a surface of the three-dimensional model that neighbors that three-dimensional position.
 8. The image processing apparatus according to claim 1, wherein the deriving unit is further configured to derive the position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on the result of the associating and the deterioration degree corresponding to a position at which the feature was extracted.
 9. The image processing apparatus according to claim 1, further comprising a setting unit configured to set an extraction parameter used to extract the feature from the captured image in accordance with the deterioration degree, wherein the extraction unit is further configured to extract the feature from the captured image by using the extraction parameter set by the setting unit.
 10. The image processing apparatus according to claim 9, wherein the setting unit is further configured to set the extraction parameter so that a feature is extracted by filtering processing that uses a filter, wherein the higher the deterioration degree is, the larger the size of the filter is.
 11. The image processing apparatus according to claim 9, wherein the setting unit is further configured to set the extraction parameter so that a feature is extracted by filtering processing after resizing the captured image, wherein the higher the deterioration degree is, the smaller the captured image is resized.
 12. The image processing apparatus according to claim 9, wherein the setting unit is further configured to set a plurality of extraction parameters in accordance with the deterioration degree, and the extraction unit is further configured to use each of the plurality of extraction parameters to extract a feature in accordance with filtering processing, and select at least one extraction result from a plurality of extraction results in accordance with a response value of filtering processing.
 13. The image processing apparatus according to claim 1, wherein the deterioration degree indicates at least one of a blur amount and a bokeh amount of the image.
 14. The image processing apparatus according to claim 1, further comprising a deterioration degree calculation unit configured to calculate the deterioration degree by using the three-dimensional model of the target object, based on an image capturing condition of the target object according to the image capturing apparatus.
 15. The image processing apparatus according to claim 1, further comprising a deterioration degree calculation unit configured to calculate the deterioration degree based on a captured image obtained by the image capturing apparatus capturing the target object.
 16. The image processing apparatus according to claim 1, further comprising a deterioration degree calculation unit configured to estimate a captured image obtained by the image capturing apparatus capturing the target object using the three-dimensional model of the target object based on an image capturing condition of the target object according to the image capturing apparatus, and calculate the deterioration degree based on the estimated image.
 17. The image processing apparatus according to claim 1, further comprising an image capturing unit configured to obtain the captured image by capturing the target object.
 18. The image processing apparatus according to claim 1, further comprising: an image capturing unit configured to obtain the captured image by capturing the target object; a robot arm comprising a movable shaft; and a control unit for controlling a position and orientation of the robot arm in accordance with the derived position and orientation of the target object.
 19. An image processing apparatus comprising: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image for a position in the captured image; a holding unit configured to hold a plurality of comparison target images; and a determination unit configured to determine, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.
 20. An image processing apparatus comprising: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image; a setting unit configured to set an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and an extraction unit configured to extract a feature of the captured image by using the extraction parameter set by the setting unit with reference to the captured image.
 21. The image processing apparatus according to claim 20, wherein the deterioration degree obtaining unit is further configured to obtain information that indicates an image deterioration degree for a position in the captured image, and the setting unit is further configured to, for the position of the captured image, set an extraction parameter used to extract a feature at the position in accordance with the deterioration degree for the position.
 22. The image processing apparatus according to claim 20, wherein the deterioration degree obtaining unit is further configured to obtain the information that indicates the deterioration degree, which is associated with a position and orientation between the target object and a viewpoint of the image capturing apparatus.
 23. The image processing apparatus according to claim 20, wherein the setting unit is further configured to set the extraction parameter so that a feature is extracted by filtering processing that uses a filter, wherein the higher the deterioration degree is, the larger the size of the filter is.
 24. The image processing apparatus according to claim 20, wherein the setting unit is further configured to set the extraction parameter so that a feature is extracted by filtering processing after resizing the captured image, wherein the higher the deterioration degree is, the smaller the captured image is resized.
 25. The image processing apparatus according to claim 20, wherein the setting unit is further configured to set a plurality of extraction parameters in accordance with the deterioration degree, and the extraction unit is further configured to use each of the plurality of extraction parameters to extract a feature in accordance with filtering processing, and select at least one extraction result from a plurality of extraction results in accordance with a response value of filtering processing.
 26. An image processing method comprising: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image, for a position in the captured image; extracting a feature of the target object from the captured image based on the deterioration degree; holding a three-dimensional model of the target object; associating the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and deriving a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of the associating.
 27. An image processing method comprising: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image for a position in the captured image; holding a plurality of comparison target images; and determining, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.
 28. An image processing method comprising: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image; setting an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and extracting a feature of the captured image by using the extraction parameter with reference to the captured image.
 29. A non-transitory computer-readable medium storing a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; extract a feature of the target object from the captured image based on the deterioration degree; hold a three-dimensional model of the target object; associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association.
 30. A non-transitory computer-readable medium storing a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image for a position in the captured image; hold a plurality of comparison target images; and determine, from a plurality of the comparison target images, an image that corresponds to an image of the target object, based on a degree of matching between the image of the target object and comparison target images from the plurality of the comparison target images, wherein the degree of matching is based on a difference with a corresponding feature of a comparison target image for each of a plurality of features of the image of the target object, which is weighted in accordance with the deterioration degree corresponding to a position at which the feature is extracted.
 31. A non-transitory computer-readable medium storing a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image; set an extraction parameter used to extract a feature from the captured image in accordance with the deterioration degree; and extract a feature of the captured image by using the extraction parameter with reference to the captured image.
 32. An image processing apparatus comprising: an image obtaining unit configured to obtain a captured image of a target object that is captured by an image capturing apparatus; a deterioration degree obtaining unit configured to obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; an extraction unit configured to extract a feature of the target object from the captured image; a model holding unit configured to hold a three-dimensional model of the target object; an associating unit configured to associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and a deriving unit configured to derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association and the deterioration degree corresponding to a position at which the feature was extracted.
 33. An image processing method comprising: obtaining a captured image of a target object that is captured by an image capturing apparatus; obtaining information that indicates a deterioration degree of the captured image, for a position in the captured image; extracting a feature of the target object from the captured image; holding a three-dimensional model of the target object; associating the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and deriving a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of the associating and the deterioration degree corresponding to a position at which the feature was extracted.
 34. A non-transitory computer-readable medium storing a program thereon, wherein the program is configured to cause a computer to: obtain a captured image of a target object that is captured by an image capturing apparatus; obtain information that indicates a deterioration degree of the captured image, for a position in the captured image; extract a feature of the target object from the captured image; hold a three-dimensional model of the target object; associate the feature of the target object and a feature of the three-dimensional model observed when the three-dimensional model is arranged in accordance with a predetermined position and orientation; and derive a position and orientation of the target object with respect to the image capturing apparatus by correcting the predetermined position and orientation based on a result of association and the deterioration degree corresponding to a position at which the feature was extracted.
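
As an illustration only, the relationship recited in claims 10 and 23 (the higher the deterioration degree, the larger the filter) might be sketched as follows. This is a minimal one-dimensional sketch and not the claimed implementation. It assumes the deterioration degree is expressed as a blur spread in pixels; the names `filter_half_width`, `edge_response`, and `extract_edge`, and the box-derivative filter itself, are hypothetical choices made for this example.

```python
import math


def filter_half_width(deterioration, min_half_width=1):
    """Map a deterioration degree to a filter half-width in pixels.

    Assumption: the deterioration degree is a blur spread in pixels.
    The mapping is non-decreasing, so higher deterioration gives a larger filter.
    """
    return max(min_half_width, math.ceil(deterioration))


def edge_response(scanline, index, half_width):
    """Box-derivative response: mean intensity right of `index` minus mean left of it."""
    left = scanline[index - half_width:index]
    right = scanline[index:index + half_width]
    return sum(right) / half_width - sum(left) / half_width


def extract_edge(scanline, half_width):
    """Return (position, response) of the strongest edge on a 1-D scanline."""
    candidates = range(half_width, len(scanline) - half_width + 1)
    best = max(candidates, key=lambda i: abs(edge_response(scanline, i, half_width)))
    return best, edge_response(scanline, best, half_width)


# A sharp step and a blurred ramp, both centred on index 10.
sharp = [0] * 10 + [10] * 10
blurred = [0] * 8 + [2, 4, 6, 8] + [10] * 8

sharp_edge = extract_edge(sharp, filter_half_width(0.0))
blurred_edge = extract_edge(blurred, filter_half_width(4.0))
blurred_small_filter = extract_edge(blurred, 1)
```

On the blurred ramp, the one-pixel filter sees the same small response at every ramp sample, so the edge position is ambiguous, while the filter sized from the deterioration degree gives a single, stronger peak at the centre of the ramp.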
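
The correction recited in claims 8 and 32 to 34, in which the result of association is used together with the deterioration degree at each feature position, might likewise be sketched as a weighted least-squares update. This is a deliberately simplified sketch under stated assumptions: the pose is reduced to a two-dimensional image translation rather than a full position and orientation, association is by nearest neighbour, and the weighting `1 / (1 + deterioration)` is a hypothetical choice that the claims do not specify. The function names are likewise hypothetical.

```python
def weight(deterioration):
    """Assumed weighting: features from more deteriorated positions count less."""
    return 1.0 / (1.0 + deterioration)


def correct_translation(model_points, image_features, pose=(0.0, 0.0), iterations=5):
    """Refine a 2-D translation so that projected model points approach image features.

    model_points:   list of (x, y) model feature positions.
    image_features: list of (x, y, deterioration) extracted from the captured image.
    pose:           the predetermined (coarse) translation to be corrected.
    """
    tx, ty = pose
    for _ in range(iterations):
        sum_w = sum_dx = sum_dy = 0.0
        for mx, my in model_points:
            px, py = mx + tx, my + ty  # model feature under the current pose
            # Association: nearest image feature to the projected model feature.
            fx, fy, d = min(
                image_features,
                key=lambda f: (f[0] - px) ** 2 + (f[1] - py) ** 2,
            )
            w = weight(d)
            sum_w += w
            sum_dx += w * (fx - px)
            sum_dy += w * (fy - py)
        # Weighted least-squares update for a pure translation.
        tx += sum_dx / sum_w
        ty += sum_dy / sum_w
    return tx, ty


# True shift is (1, -1); the third feature is badly localized (y is off by 3)
# but carries a high deterioration degree, so it is down-weighted.
model = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
features = [(1.0, -1.0, 0.0), (11.0, -1.0, 0.0), (1.0, 12.0, 9.0)]
tx, ty = correct_translation(model, features)
```

With equal weights the three vertical residuals would cancel and leave the vertical estimate at 0. With the deterioration-based weights the estimate lands within 0.15 pixels of the true value of -1.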